# Correlation analysis of Nominal data with Chi-Square Test in Data Mining

## Chi-Square Test

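The correlation between two nominal (categorical) attributes can be analyzed with the chi-square test. The test compares the counts actually observed with the counts that would be expected if the two attributes were independent.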

## Correlation vs. Causality

Correlation does not necessarily imply causality.

**Example:**

- The number of students who passed an exam and the number of car thefts in a country may be correlated with each other, but this **does not** mean that the number of students who passed affects car theft in the country.

But in some cases there may be a causal link:

- The number of students who passed an exam and the number of students who live near the university are correlated with each other, and living near the university may **be a cause** of the students' results.

| | Passed student | Not passed student | Sum |
|---|---|---|---|
| Live near University | Observed = 140, Expected = 45 | Observed = 190, Expected = 285 | 330 |
| Not live near University | Observed = 40, Expected = 135 | Observed = 950, Expected = 855 | 990 |
| Sum | 140 + 40 = 180 | 190 + 950 = 1140 | 1320 |
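The expected values in the table can be reproduced with a short calculation. The sketch below is a minimal illustration and is not part of the original lesson. It assumes the usual rule that each expected count equals (row total × column total) / grand total, and it uses the standard Pearson statistic, the sum of (Observed − Expected)² / Expected over all cells. The function names `expected_counts` and `chi_square_statistic` are made up for this example.

```python
# Observed counts from the table above.
# Rows: lives near university / does not live near university.
# Columns: passed / not passed.
observed = [[140, 190], [40, 950]]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(table, expected)
        for o, e in zip(o_row, e_row)
    )


expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)

print(expected)        # [[45.0, 285.0], [135.0, 855.0]]
print(round(chi2, 2))  # 309.63
```

The computed expected counts match the values in the table. The large statistic reflects how far the observed counts are from the counts expected under independence.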

**Degrees of freedom:**

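DF = (r – 1) * (c – 1)

Here r is the number of rows and c is the number of columns of the table, not counting the Sum row and Sum column. For the 2 × 2 table above, DF = (2 – 1) * (2 – 1) = 1.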
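As a small illustration of the formula, the following sketch computes the degrees of freedom directly from a table of observed counts. The helper name `degrees_of_freedom` is hypothetical, and the table is assumed to be a list of rows that excludes the Sum row and Sum column.

```python
def degrees_of_freedom(table):
    """DF = (r - 1) * (c - 1), with r rows and c columns of observed counts."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)


# Observed counts for the student example (totals excluded).
observed = [[140, 190], [40, 950]]
df = degrees_of_freedom(observed)
print(df)  # 1
```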

Level of significance:

.01 | .05 | .10 |