Correlation analysis of Nominal data with Chi-Square Test in Data Mining

Chi-Square Test

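The correlation between two nominal attributes can be analyzed with the chi-square test. The test compares the counts observed in the data with the counts that would be expected if the two attributes were independent of each other.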

Correlation vs. Causality:

Correlation does not always tell us about causality.

Example:

  • The number of students who pass an exam and the number of car thefts in a country may be correlated with each other, but that does not mean that the number of students who pass affects car theft in the country.

But in some cases a correlation may reflect a cause:

  • The number of students who pass the exam and the number of students who live near the university are correlated with each other, and living near the university may be a cause of the students' results.

                          Passed student             Not passed student         Sum
Live near University      Observed = 140             Observed = 190             330
                          Expected = 180*330/1320    Expected = 1140*330/1320
                          Expected = 45              Expected = 285
Not live near University  Observed = 40              Observed = 950             990
                          Expected = 180*990/1320    Expected = 1140*990/1320
                          Expected = 135             Expected = 855
Sum                       140 + 40 = 180             190 + 950 = 1140           1320
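
The expected counts in the table above all follow one rule: row sum times column sum, divided by the grand total. A minimal Python sketch of that rule is given here; the function name `expected_counts` and the nested-list layout are illustrative choices, not part of the original tutorial.

```python
# Observed counts from the table above.
# Rows: lives near university (yes, no). Columns: passed, not passed.
observed = [
    [140, 190],
    [40, 950],
]

def expected_counts(table):
    """Return the expected count of every cell if rows and columns were independent."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    # Expected count = (row sum * column sum) / grand total
    return [[r * c / total for c in col_sums] for r in row_sums]

expected = expected_counts(observed)
print(expected)  # [[45.0, 285.0], [135.0, 855.0]]
```

The printed values match the Expected entries in the table.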

[Figure: correlation analysis]
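
The test statistic is not written out in the text. Assuming the usual Pearson form of the chi-square statistic, with O_ij the observed count and E_ij the expected count in row i and column j (symbols introduced here for illustration), the values from the table work out roughly as follows:

```latex
\begin{aligned}
\chi^2 &= \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \\
       &= \frac{(140 - 45)^2}{45} + \frac{(190 - 285)^2}{285}
        + \frac{(40 - 135)^2}{135} + \frac{(950 - 855)^2}{855} \\
       &\approx 200.556 + 31.667 + 66.852 + 10.556 \\
       &\approx 309.63
\end{aligned}
```

The larger this value, the further the observed counts are from what independence would predict.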

Degrees of freedom:

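DF = (r – 1) * (c – 1)

Here r is the number of rows and c is the number of columns in the table, not counting the Sum row and the Sum column. For the table above, DF = (2 – 1) * (2 – 1) = 1.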

Level of significance:

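Commonly used levels of significance are 0.01, 0.05, and 0.10. The calculated chi-square value is compared with the critical value for the chosen level and the degrees of freedom; if it is larger, the two attributes are considered correlated.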
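
The whole test for the student table can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: it uses the usual Pearson chi-square statistic, the critical values 6.635, 3.841, and 2.706 are the standard table values for 1 degree of freedom (they do not appear in the original text), and the name `chi_square_statistic` is hypothetical.

```python
import math

# Observed counts. Rows: lives near university (yes, no). Columns: passed, not passed.
observed = [[140, 190], [40, 950]]

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / total  # expected count for this cell
            stat += (o - e) ** 2 / e
    return stat

chi_square = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # DF = (r - 1) * (c - 1)

# Assumption: standard chi-square critical values for 1 degree of freedom.
critical_values = {0.01: 6.635, 0.05: 3.841, 0.10: 2.706}

for alpha, critical in critical_values.items():
    verdict = "correlated" if chi_square > critical else "no evidence of correlation"
    print(f"alpha={alpha}: chi-square={chi_square:.2f} vs {critical} -> {verdict}")

# For DF = 1 only, the p-value has a closed form using the complementary error function.
p_value = math.erfc(math.sqrt(chi_square / 2))
```

With a chi-square value of about 309.63 and DF = 1, the statistic exceeds the critical value at all three levels, so passing the exam and living near the university are correlated in this data. As noted earlier, that alone does not establish a cause.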
