Computing Information Gain for Continuous-Valued Attributes in data mining

In this tutorial, we will learn how to compute information gain for continuous-valued attributes.

First of all, let's see what continuous attributes are.

Continuous attributes can be represented as floating-point variables, for example temperature, width, height, or the weight of a body.

Calculating the split point for a continuous attribute is straightforward. For example, suppose we have the data shown below.

How can we calculate the split point?

Income    Class
18        YES
45        NO
18        NO
25        YES
28        YES
28        NO
34        NO

Solution: calculating the split point

Step 1:

First, sort the data in ascending order of income. The sorted data is shown in the table below.

Income    Class
18        YES
18        NO
25        YES
28        YES
28        NO
34        NO
45        NO
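
The sorting step can be sketched in Python as follows. This is a minimal illustration, not part of the original tutorial: the names `records`, `sorted_records`, and `candidates` are hypothetical, and the candidate split points are assumed to be the midpoints between adjacent distinct income values, which is the usual convention for continuous attributes.

```python
# Records from the example table as (income, class) pairs.
records = [
    (18, "YES"), (45, "NO"), (18, "NO"), (25, "YES"),
    (28, "YES"), (28, "NO"), (34, "NO"),
]

# Step 1: sort by income in ascending order (sorted() is stable,
# so records with equal income keep their original relative order).
sorted_records = sorted(records, key=lambda r: r[0])

# Assumption: candidate split points are the midpoints between
# adjacent distinct income values.
values = sorted({income for income, _ in records})
candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

print(sorted_records)
print(candidates)  # [21.5, 26.5, 31.0, 39.5]
```

Duplicate income values (18 and 28) are collapsed before taking midpoints, since a split between two equal values would not separate any tuples.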

Step 2:

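Take the midpoint of the first two distinct values (18 and 25) as a candidate split point, and calculate the expected information of the resulting partition, from which the information gain follows.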

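Split point = (18 + 25) / 2 = 21.5

The two tuples with income below 21.5 contain 1 YES and 1 NO; the remaining five tuples contain 2 YES and 3 NO. Here I(p, n) denotes the entropy of a partition with p YES tuples and n NO tuples.

$$
\begin{aligned}
\mathrm{Info}_{income<21.5}(D) &= \frac{2}{7}\, I(1,1) + \frac{5}{7}\, I(2,3) \\
&= \frac{2}{7}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right) + \frac{5}{7}\left(-\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5}\right) \\
&\approx 0.98
\end{aligned}
$$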
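
The calculation for the split at the midpoint of 18 and 25 can be checked with a short Python sketch. The function names (`entropy`, `class_counts`, `expected_info`) are hypothetical helpers introduced here for illustration. The last two steps go slightly beyond what the tutorial shows: they assume the standard definition Gain = Info(D) - Info_income(D), and that the remaining candidate midpoints are evaluated in the same way so that the split with the highest gain can be chosen.

```python
from math import log2

def entropy(counts):
    """Entropy I(...) of a partition given its per-class counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def class_counts(rows):
    """Return [number of YES, number of NO] for a list of (income, class)."""
    yes = sum(1 for _, label in rows if label == "YES")
    return [yes, len(rows) - yes]

def expected_info(records, split):
    """Weighted entropy of the two partitions: income < split, income >= split."""
    left = [r for r in records if r[0] < split]
    right = [r for r in records if r[0] >= split]
    n = len(records)
    return sum(len(part) / n * entropy(class_counts(part))
               for part in (left, right) if part)

records = [
    (18, "YES"), (18, "NO"), (25, "YES"), (28, "YES"),
    (28, "NO"), (34, "NO"), (45, "NO"),
]

split = (18 + 25) / 2                       # 21.5
info_d = entropy(class_counts(records))     # I(3, 4) for the whole data set
info_split = expected_info(records, split)  # 2/7 I(1,1) + 5/7 I(2,3)
gain = info_d - info_split

print(round(info_d, 3), round(info_split, 3), round(gain, 3))  # 0.985 0.979 0.006

# Assumption: repeat for every midpoint between adjacent distinct values
# and keep the split point with the highest gain.
values = sorted({income for income, _ in records})
candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
best = max(candidates, key=lambda s: info_d - expected_info(records, s))
print(best)  # 31.0
```

Under these assumptions the split at 21.5 yields a gain of only about 0.006, while the split at 31.0 separates the two highest incomes (both NO) and gives the largest gain on this data set.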


 


Latest posts by Prof. Fazal Rehman Shamil (see all)
