In this tutorial, we will learn how to compute information gain for continuous-valued attributes.

First of all, what are continuous attributes?

Continuous attributes take numeric values and can be represented as floating-point variables, for example the temperature, width, height, or weight of a body.

Calculating the split point for a continuous attribute is straightforward. We will work through it using the small data set shown below.

How can we calculate the split point?

Income	Class
18	YES
45	NO
18	NO
25	YES
28	YES
28	NO
34	NO

Solution to calculate the split point

Step 1:

First, sort the data in ascending order of the attribute value (Income). The sorted data is shown in the table below.

Income	Class
18	YES
18	NO
25	YES
28	YES
28	NO
34	NO
45	NO
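
The sorting step can be sketched in a few lines of Python. This is only an illustration, not part of the original tutorial: the variable names `records` and `sorted_records` are assumptions, and the rows are copied from the unsorted table above. Sorting on income alone keeps tied rows in their original order, which reproduces the sorted table.

```python
# Records copied from the tutorial's unsorted table: (income, class label).
records = [
    (18, "YES"), (45, "NO"), (18, "NO"), (25, "YES"),
    (28, "YES"), (28, "NO"), (34, "NO"),
]

# Sort by income only. Python's sort is stable, so rows with equal
# income (18 and 28) keep the order they had in the original table.
sorted_records = sorted(records, key=lambda row: row[0])

for income, label in sorted_records:
    print(income, label)
```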

Step 2:

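Find the midpoint of the first two distinct values (18 and 25) and calculate the expected information for a split at that point. Two tuples fall at or below the split point (1 YES, 1 NO) and five fall above it (2 YES, 3 NO).

Split point = (18 + 25) / 2 = 21.5

Info_income(D), split at 21.5 = 2/7 × I(1,1) + 5/7 × I(2,3)

= 2/7 × (-1/2 log2(1/2) - 1/2 log2(1/2)) + 5/7 × (-2/5 log2(2/5) - 3/5 log2(3/5))

= 2/7 × 1 + 5/7 × 0.971

= 0.98

The information gain of this split is the information of the whole data set, Info(D) = I(3,4) = 0.985, minus this value: 0.985 - 0.979 = 0.006. The same calculation is repeated for the midpoint of every other pair of adjacent distinct values, and the midpoint with the highest gain is chosen as the split point.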
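
The calculation above can be checked with a short Python sketch. It is a minimal illustration under stated assumptions, not the tutorial's own code: the helper names `entropy`, `info_after_split`, `information_gain`, and `candidate_split_points` are hypothetical, the split is taken as income <= split point versus income > split point, and the split point for 18 and 25 is their midpoint, 21.5.

```python
from collections import Counter
from math import log2

# Sorted records from the tutorial's table: (income, class label).
records = [
    (18, "YES"), (18, "NO"), (25, "YES"), (28, "YES"),
    (28, "NO"), (34, "NO"), (45, "NO"),
]

def entropy(labels):
    """I(...) from the text: entropy in bits of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_after_split(rows, split_point):
    """Weighted entropy of the two partitions produced by split_point."""
    left = [label for income, label in rows if income <= split_point]
    right = [label for income, label in rows if income > split_point]
    total = len(rows)
    return (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)

def information_gain(rows, split_point):
    """Info(D) minus the expected information after the split."""
    all_labels = [label for _, label in rows]
    return entropy(all_labels) - info_after_split(rows, split_point)

def candidate_split_points(rows):
    """Midpoints between adjacent distinct attribute values."""
    values = sorted({income for income, _ in rows})
    return [(a + b) / 2 for a, b in zip(values, values[1:])]

# Expected information for the first candidate, as in Step 2.
print(round(info_after_split(records, 21.5), 2))  # 0.98

# Gain for every candidate midpoint.
for point in candidate_split_points(records):
    print(point, round(information_gain(records, point), 3))
```

On this data the candidate midpoints are 21.5, 26.5, 31.0, and 39.5. Under the assumptions above, the loop shows that 31.0 gives the largest gain (about 0.29), so it would be the split point selected for Income.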
