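Correlation Analysis of Nominal Data with the Chi-Square Test in Data Mining
Chi-Square Test
The correlation between two nominal attributes can be analyzed with the chi-square test. The test compares the observed counts in a table of the two attributes with the counts that would be expected if the attributes were independent.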
Correlation vs. Causality:
Correlation does not always imply causality.
Example:
 The number of students who pass an exam and the number of car thefts in a country may be correlated, but that does not mean that the number of students who pass affects car theft in the country.
But in some cases a causal link is plausible:
 The number of students who pass the exam and the number of students who live near the university may be correlated, and living near the university may be one cause of a student's result.
                          Passed student                 Not passed student              Sum
Live near University      Observed = 140                 Observed = 190                  330
                          Expected = 180*330/1320 = 45   Expected = 1140*330/1320 = 285
Not live near University  Observed = 40                  Observed = 950                  990
                          Expected = 180*990/1320 = 135  Expected = 1140*990/1320 = 855
Sum                       140 + 40 = 180                 190 + 950 = 1140                1320
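Each expected count in the table is the column sum times the row sum, divided by the grand total. The chi-square statistic is then the sum of (observed - expected)^2 / expected over all cells. A minimal Python sketch of that arithmetic follows; the function names are illustrative and not part of the original tutorial.

```python
# Observed counts from the table above.
# Rows: live near university, not live near university.
# Columns: passed, not passed.
observed = [
    [140, 190],
    [40, 950],
]

def expected_counts(table):
    """Expected count per cell = row sum * column sum / grand total."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

expected = expected_counts(observed)
statistic = chi_square(observed)
print(expected)             # [[45.0, 285.0], [135.0, 855.0]]
print(round(statistic, 2))  # 309.63
```

The expected counts match the table, and the statistic for this data is about 309.63.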
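Degrees of freedom:
DF = (r – 1) * (c – 1), where r is the number of rows and c is the number of columns in the table, not counting the Sum row and column. For the 2 x 2 table above, DF = (2 – 1) * (2 – 1) = 1.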
Level of significance:
Commonly used levels of significance are .01, .05 and .10.
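The degrees of freedom and the level of significance together select a critical value from a chi-square table, and the statistic is compared against it. The sketch below is a hedged illustration, not the tutorial's own procedure. It recomputes the statistic from the observed and expected counts in the table, and it hardcodes the standard table values for DF = 1 (6.635, 3.841 and 2.706 for levels .01, .05 and .10). The names are hypothetical, and the p-value line uses an identity that holds only for DF = 1.

```python
import math

# Observed and expected counts from the table (same cell order in both lists).
observed = [140, 190, 40, 950]
expected = [45, 285, 135, 855]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def degrees_of_freedom(rows, cols):
    """DF = (r - 1) * (c - 1)."""
    return (rows - 1) * (cols - 1)

df = degrees_of_freedom(2, 2)

# Standard chi-square critical values for DF = 1 only (assumption: any
# other DF needs a different row of the chi-square table).
CRITICAL_DF1 = {0.01: 6.635, 0.05: 3.841, 0.10: 2.706}

# For DF = 1 only, P(chi-square > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

for alpha, critical in CRITICAL_DF1.items():
    # True means the statistic exceeds the critical value, so the
    # hypothesis that the two attributes are independent is rejected.
    print(alpha, critical, statistic > critical)
```

The statistic is far above all three critical values, so at each of these levels the two attributes are correlated in this data. As noted earlier, that alone does not establish causality.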