By: Prof. Dr. Fazal Rehman | Last updated: March 3, 2022
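What is the Gini index?
The Gini index is one of the most commonly used measures of inequality. It is also referred to as the Gini ratio or Gini coefficient. In decision tree induction it measures how mixed (impure) the class labels of a dataset are. The Gini index for binary variables is calculated in the example below.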
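For reference, the calculation in this example can be written in general form. The symbols here are notation assumed for illustration and do not appear in the original: $p_i$ is the fraction of records in class $i$, $k$ is the number of classes, $A$ is the attribute being tested, and $X_v$ is the subset of $X$ where $A$ takes the value $v$.

```latex
\mathrm{Gini}(X) = 1 - \sum_{i=1}^{k} p_i^{2}
\qquad
\mathrm{GiniGain}(A) = \mathrm{Gini}(X) - \sum_{v} \frac{|X_v|}{|X|}\,\mathrm{Gini}(X_v)
```

For a binary target with 4 records in one class and 5 in the other, the first formula gives $1 - [(4/9)^2 + (5/9)^2] = 40/81$.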
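| Student | inHostel | Target Class |
|---------|----------|--------------|
| True    | True     | Yes          |
| True    | True     | Yes          |
| True    | False    | No           |
| False   | False    | Yes          |
| False   | True     | No           |
| False   | True     | No           |
| False   | False    | No           |
| True    | False    | Yes          |
| False   | True     | No           |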
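Now we will calculate the Gini index of the dataset X and the Gini gain of the attributes Student and inHostel.

Step 1: Gini index of the target class (4 Yes, 5 No)
$\mathrm{Gini}(X) = 1 - [(4/9)^2 + (5/9)^2] = 40/81$

Step 2: Gini gain of Student
$\mathrm{Gini}(\text{Student} = \text{False}) = 1 - [(1/5)^2 + (4/5)^2] = 8/25$
$\mathrm{Gini}(\text{Student} = \text{True}) = 1 - [(3/4)^2 + (1/4)^2] = 3/8$
$\mathrm{GiniGain}(\text{Student}) = \mathrm{Gini}(X) - [4/9 \cdot \mathrm{Gini}(\text{Student} = \text{True}) + 5/9 \cdot \mathrm{Gini}(\text{Student} = \text{False})] = 0.149$

Step 3: Gini gain of inHostel
$\mathrm{Gini}(\text{inHostel} = \text{False}) = 1 - [(2/4)^2 + (2/4)^2] = 1/2$
$\mathrm{Gini}(\text{inHostel} = \text{True}) = 1 - [(2/5)^2 + (3/5)^2] = 12/25$
$\mathrm{GiniGain}(\text{inHostel}) = \mathrm{Gini}(X) - [5/9 \cdot \mathrm{Gini}(\text{inHostel} = \text{True}) + 4/9 \cdot \mathrm{Gini}(\text{inHostel} = \text{False})] = 0.005$

Results
The best attribute to split on is Student because it has the higher Gini gain (0.149 versus 0.005).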
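The three steps above can be checked with a short Python sketch. This is only an illustration and not part of the original tutorial: the names `ROWS`, `gini`, and `gini_gain` are hypothetical, and the Student and inHostel columns are encoded as booleans (4 student rows and 5 non-student rows, matching the 4/9 and 5/9 weights used above). `fractions.Fraction` keeps the results exact so they can be compared with the hand-computed fractions.

```python
from collections import Counter
from fractions import Fraction

# Rows of the example table: (Student, inHostel, Target Class).
# Boolean encoding of the two attributes is an assumption for this sketch.
ROWS = [
    (True, True, "Yes"),
    (True, True, "Yes"),
    (True, False, "No"),
    (False, False, "Yes"),
    (False, True, "No"),
    (False, True, "No"),
    (False, False, "No"),
    (True, False, "Yes"),
    (False, True, "No"),
]


def gini(labels):
    """Gini index of a non-empty list of labels: 1 - sum of squared class fractions."""
    n = len(labels)
    counts = Counter(labels)
    return 1 - sum(Fraction(c, n) ** 2 for c in counts.values())


def gini_gain(rows, attr_index):
    """Gini(X) minus the size-weighted Gini index of each partition of one attribute."""
    labels = [row[-1] for row in rows]
    weighted = Fraction(0)
    for value in {row[attr_index] for row in rows}:
        subset = [row[-1] for row in rows if row[attr_index] == value]
        weighted += Fraction(len(subset), len(rows)) * gini(subset)
    return gini(labels) - weighted


if __name__ == "__main__":
    print("Gini(X)            =", gini([row[-1] for row in ROWS]))
    print("GiniGain(Student)  =", round(float(gini_gain(ROWS, 0)), 3))
    print("GiniGain(inHostel) =", round(float(gini_gain(ROWS, 1)), 3))
```

Under these assumptions the exact gains are 121/810 for Student and 2/405 for inHostel, which round to the 0.149 and 0.005 shown in Steps 2 and 3.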