Data normalization methods in data mining

Let us look at data normalization, a preprocessing step applied before data mining.

There are several techniques for normalizing data. Some of the most common ones are listed below.

  1. Z-Score Normalization
  2. Min-Max Normalization
  3. Decimal Scaling Normalization
  4. Standard Deviation Normalization

1. Z-Score Normalization

Z-score normalization rescales each value according to how far it lies from the mean, measured in standard deviations. After normalization the attribute has a mean of 0 and a standard deviation of 1, which makes values measured on different scales easier to compare.

The basic z-score formula is given below, where x is the original value, μ is the mean of the attribute, and σ is its standard deviation:
z = (x - μ) / σ
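
As a rough illustration of the z-score formula, the following Python sketch normalizes a small list of numbers. The function name z_score_normalize and the sample data are made up for this example, and the population standard deviation is assumed (dividing by N rather than N - 1).

```python
import statistics


def z_score_normalize(values):
    """Return the z-score of every value (hypothetical helper).

    Assumes the population mean and population standard deviation.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values are identical, so z = (x - mean) / 0 is undefined.
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mean) / std for x in values]


# Sample data with mean 5 and population standard deviation 2.
scores = z_score_normalize([2, 4, 4, 4, 5, 5, 7, 9])
print(scores)
```

A value equal to the mean gets a z-score of 0, and a value one standard deviation above the mean gets a z-score of 1.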

2. Min-Max Normalization

Min-max normalization rescales the data linearly so that the smallest value maps to 0 and the largest value maps to 1. Putting every value on this common scale makes the data easier to compare.

For example, if the values of an attribute range from 0 to 1000, then 200 and 1000 become 0.2 and 1, and the difference between 0.2 and 1 is easier to judge at a glance than the difference between 200 and 1000.

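The min-max normalization equation for a value v of attribute A is given below, where min_A and max_A are the smallest and largest values of the attribute:
v' = (v - min_A) / (max_A - min_A)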
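
A minimal Python sketch of min-max scaling to the range 0 to 1 might look like the following. The function name min_max_normalize and the sample values are assumptions chosen for illustration only.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1.

    Hypothetical helper; each value becomes (v - min) / (max - min).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # Every value is the same, so the range is zero and scaling is undefined.
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_normalize([200, 400, 600, 1000])
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

Note that the result depends on the minimum and maximum of the data, so a single extreme value squeezes all the other values into a narrow part of the range.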

3. Decimal Scaling Normalization

Decimal scaling is a data normalization technique in which we move the decimal point of the values of an attribute. The number of places the decimal point moves depends on the maximum absolute value of the attribute.

A value v_i of attribute A can be normalized with the following formula:

Normalized value of attribute = v_i / 10^j

Here j is the smallest integer such that the largest absolute normalized value is less than 1.
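
The sketch below shows one way decimal scaling could be written in Python. It assumes the usual rule that j is the smallest non-negative integer for which every scaled value has an absolute value below 1. The function name decimal_scale and the sample data are hypothetical.

```python
def decimal_scale(values):
    """Divide every value by 10**j (hypothetical helper).

    j is the smallest non-negative integer such that the largest
    absolute scaled value is less than 1.
    """
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values], j


scaled_values, j = decimal_scale([-10, 201, 301, -401, 501, 601, 701])
print(j)              # 3, because the largest absolute value is 701
print(scaled_values)
```

With a maximum absolute value of 701 the decimal point moves three places, so 701 becomes 0.701 and -10 becomes -0.01.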

4. Standard Deviation Normalization

The values in a data set can lie at different distances from the mean. Variance measures how far, on average, the values are from the mean: it is the average of the squared differences between each value and the mean.

Standard deviation is the square root of the variance.

  1. A high standard deviation tells us that the values are spread far from the mean.
  2. A low standard deviation tells us that the values are clustered close to the mean.

The standard deviation formula for N values x_1, ..., x_N with mean μ is:
σ = sqrt( ((x_1 - μ)^2 + ... + (x_N - μ)^2) / N )
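
To make the variance and standard deviation definitions concrete, here is a small Python sketch that compares two made-up data sets with the same mean. The function names and the data are assumptions for illustration, and the population form (dividing by N) is assumed.

```python
import math


def variance(values):
    """Population variance: the mean of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def std_dev(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))


tight = [9, 10, 10, 11]  # values clustered close to the mean of 10
wide = [1, 5, 15, 19]    # values spread far from the same mean of 10

print(std_dev(tight))
print(std_dev(wide))
```

Both lists have a mean of 10, but the second list produces a much larger standard deviation because its values lie farther from the mean.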