Correlation analysis of Nominal data with Chi-Square Test in Data Mining
Chi-Square Test
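Correlation between two nominal (categorical) attributes can be analyzed with the chi-square test. The test compares the observed count for each combination of values with the count expected if the two attributes were independent.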
Correlation vs. Causality:
Correlation does not necessarily imply causality.
Example:
- The number of students who passed an exam and the number of car thefts in a country may be correlated, but that does not mean the number of students who passed affects car thefts.
In some cases, however, a causal link is plausible:
- The number of students who passed the exam and the number of students who live near the university may be correlated, and living near the university may be one cause of the students' results.
| | Passed student | Not passed student | Sum |
| Live near University | Observed = 140, Expected = 180 * 330 / 1320 = 45 | Observed = 190, Expected = 1140 * 330 / 1320 = 285 | 330 |
| Not live near University | Observed = 40, Expected = 180 * 990 / 1320 = 135 | Observed = 950, Expected = 1140 * 990 / 1320 = 855 | 990 |
| Sum | 140 + 40 = 180 | 190 + 950 = 1140 | 1320 |
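The expected counts in the table can be reproduced with a short script, and the same script can go on to compute the chi-square statistic itself. This is a minimal sketch in plain Python, assuming the usual Pearson formula (the sum over all cells of (observed - expected)^2 / expected); the variable names are illustrative only.

```python
# Observed counts from the table.
# Rows: lives near university (yes, no). Columns: passed, not passed.
observed = [
    [140, 190],
    [40, 950],
]

row_totals = [sum(row) for row in observed]        # [330, 990]
col_totals = [sum(col) for col in zip(*observed)]  # [180, 1140]
grand_total = sum(row_totals)                      # 1320

# Expected count for a cell = row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected)              # [[45.0, 285.0], [135.0, 855.0]]
print(round(chi_square, 2))  # 309.63
```

The larger the statistic, the further the observed counts are from what independence would predict.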
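Degrees of freedom:
DF = (r - 1) * (c - 1), where r is the number of rows and c is the number of columns in the table of observed counts, not counting the Sum row and Sum column.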
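As a quick check of the formula, the sketch below applies it to the student table, which has two rows and two columns of observed counts. The helper name `degrees_of_freedom` is hypothetical and chosen only for this illustration.

```python
def degrees_of_freedom(table):
    """Return (r - 1) * (c - 1) for a table given as a list of rows."""
    r = len(table)     # number of rows
    c = len(table[0])  # number of columns
    return (r - 1) * (c - 1)

# Observed counts only; the Sum row and Sum column are not included.
observed = [[140, 190], [40, 950]]
print(degrees_of_freedom(observed))  # 1
```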
Level of significance:
0.01 | 0.05 | 0.10 |
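To reach a decision, the chi-square statistic is compared with the critical value for the chosen level of significance and the degrees of freedom. The sketch below assumes DF = 1 (a 2 x 2 table) and uses the standard critical values from a chi-square table for that case (6.635, 3.841 and 2.706). The p-value helper is also an assumption of this example and relies on an identity that holds only for DF = 1.

```python
import math

# Standard chi-square critical values for DF = 1 (from a chi-square table).
CRITICAL_VALUES_DF1 = {0.01: 6.635, 0.05: 3.841, 0.10: 2.706}

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic. Valid for DF = 1 only."""
    return math.erfc(math.sqrt(chi_square / 2))

# Observed and expected counts from the student table.
observed = [[140, 190], [40, 950]]
expected = [[45, 285], [135, 855]]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

for alpha, critical in CRITICAL_VALUES_DF1.items():
    reject = chi_square > critical
    print(f"alpha={alpha}: critical value={critical}, reject independence: {reject}")

print(f"p-value: {p_value_df1(chi_square):.3g}")
```

For the student table the statistic is about 309.63, far above all three critical values, so the hypothesis that the two attributes are independent is rejected at each level and the attributes are treated as correlated.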