Correlation analysis of Nominal data with Chi-Square Test in Data Mining
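The chi-square test is used to analyze the correlation between two nominal (categorical) attributes. It compares the counts observed in the data with the counts that would be expected if the two attributes were independent.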
Correlation vs. Causality:
Correlation does not always imply causality.
- The number of students who pass an exam and the number of car thefts in a country may be correlated with each other, but that does not mean the number of students who pass affects the number of car thefts.
But in some cases a correlation may reflect a causal relationship:
- The number of students who pass an exam and the number of students who live near the university may be correlated with each other, and living near the university may be one cause of the students' results.
| |Passed student|Not passed student|Sum|
|---|---|---|---|
|Live near University|Observed = 140, Expected = 180*330/1320 = 45|Observed = 190, Expected = 1140*330/1320 = 285|140 + 190 = 330|
|Not live near University|Observed = 40, Expected = 180*990/1320 = 135|Observed = 950, Expected = 1140*990/1320 = 855|40 + 950 = 990|
|Sum|140 + 40 = 180|190 + 950 = 1140|1320|
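The expected counts in the table follow one rule: row total times column total, divided by the grand total. The sketch below reproduces them in plain Python and then sums (observed - expected)^2 / expected over the four cells, which is the standard Pearson chi-square statistic. The observed values 190 and 950 are read from the Sum row of the table; the function and variable names are illustrative assumptions, not part of the original article.

```python
# Observed counts from the contingency table.
# Rows: live near university, do not live near university.
# Columns: passed, not passed.
observed = [
    [140, 190],
    [40, 950],
]


def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Pearson chi-square: sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)

print(expected)        # [[45.0, 285.0], [135.0, 855.0]]
print(round(chi2, 2))  # 309.63
```

The larger the statistic, the further the observed counts are from what independence would predict.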
Degrees of freedom:
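DF = (r - 1) * (c - 1), where r is the number of rows and c is the number of columns in the table, not counting the Sum row and column. For the 2 x 2 table above, DF = (2 - 1) * (2 - 1) = 1.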
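As a small illustration, the degrees of freedom can be computed directly from the shape of the table of observed counts. This is a minimal sketch; the helper name `degrees_of_freedom` is hypothetical, and the table passed in is assumed to hold only the observed cells, without the Sum row or column.

```python
def degrees_of_freedom(table):
    """DF = (r - 1) * (c - 1) for a table with r rows and c columns."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


# Observed counts only (totals excluded): 2 rows, 2 columns.
observed = [
    [140, 190],
    [40, 950],
]

print(degrees_of_freedom(observed))  # 1
```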
Level of significance: