Last modified on April 17th, 2020
Correlation analysis of Nominal data with Chi-Square Test in Data Mining
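The correlation between two nominal attributes can be analyzed with the chi-square test. The test compares the observed count in each combination of attribute values with the count that would be expected if the two attributes were independent.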
Correlation vs. Causality:
Correlation does not always imply causality.
- The number of students who pass an exam and the number of car thefts in a country may be correlated, but that does not mean the number of students who pass affects car theft.
But in some cases there may be a causal link:
- The number of students who pass the exam and the number of students who live near the university may be correlated, and living near the university could be one cause of a student's result.
| ||Passed student||Not passed student||Sum|
|Live near University||Observed = 140, Expected = 180*330/1320 = 45||Observed = 190, Expected = 1140*330/1320 = 285||140 + 190 = 330|
|Not live near University||Observed = 40, Expected = 180*990/1320 = 135||Observed = 950, Expected = 1140*990/1320 = 855||40 + 950 = 990|
|Sum||140 + 40 = 180||190 + 950 = 1140||1320|
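The expected counts in the table can be checked with a short Python sketch. The observed counts are taken from the table; the function names `expected_counts` and `chi_square_statistic` are illustrative only. The statistic is assumed to be the usual chi-square sum of (observed - expected)^2 / expected over all cells, which the article does not spell out.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: live near / not live near the university.
# Columns: passed / not passed.
observed = [[140, 190], [40, 950]]

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)

print(expected)        # [[45.0, 285.0], [135.0, 855.0]]
print(round(chi2, 2))  # 309.63
```

Every observed count is 95 away from its expected count, so the four terms are 9025/45, 9025/285, 9025/135 and 9025/855, which add up to roughly 309.63.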
Degrees of freedom:
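DF = (r – 1) * (c – 1)
Here r is the number of rows and c is the number of columns in the table, not counting the Sum row and Sum column. The table above has 2 rows and 2 columns, so DF = (2 – 1) * (2 – 1) = 1.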
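As a small illustration, the formula can be applied directly to a table stored as a list of rows. The helper name `degrees_of_freedom` is hypothetical, and the table is assumed to hold only the observed counts, without the Sum row or column.

```python
def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1) for a table given as a list of rows."""
    rows = len(observed)
    cols = len(observed[0])
    return (rows - 1) * (cols - 1)


# Observed counts only; totals are left out.
observed = [[140, 190], [40, 950]]
df = degrees_of_freedom(observed)
print(df)  # 1
```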
Level of significance:
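The article does not state which significance level it uses, so the sketch below assumes the common choice of 0.05. For DF = 1 the tabulated chi-square critical value at that level is 3.841. The p-value helper relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper tail can be written with `math.erfc`. All names here are illustrative.

```python
import math

ALPHA = 0.05                # assumed significance level, not given in the article
CRITICAL_VALUE_DF1 = 3.841  # tabulated critical value for DF = 1 at alpha = 0.05


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Observed counts: rows = live near / not near, columns = passed / not passed.
observed = [[140, 190], [40, 950]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / total
        chi2 += (o - e) ** 2 / e

# Independence is rejected when the statistic exceeds the critical value.
reject_independence = chi2 > CRITICAL_VALUE_DF1
print(round(chi2, 2), reject_independence)  # 309.63 True
```

Under these assumptions the statistic is far above 3.841, so the hypothesis that passing the exam and living near the university are independent would be rejected. As the article notes, this shows correlation only and does not by itself establish a cause.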