Major tasks of data pre-processing

Data Cleaning

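Data cleaning is a process to clean the data, for example by filling in missing values, smoothing noisy values, and resolving inconsistencies, in such a way that the data can be easily integrated.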
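As a rough illustration of data cleaning, the sketch below drops exact duplicate records and fills missing values with the mean of the known values. The record layout (`id`, `age`) and the helper name `clean_records` are assumptions made up for this example. Mean imputation is only one of several possible strategies.

```python
from statistics import mean

def clean_records(records):
    """Drop exact duplicates, then fill missing ages with the mean known age."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["id"], rec["age"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))  # copy so the input is not modified

    known = [r["age"] for r in unique if r["age"] is not None]
    fill = mean(known) if known else None  # nothing to impute from if all are missing
    for r in unique:
        if r["age"] is None:
            r["age"] = fill
    return unique

# Hypothetical sample data: one duplicate row and one missing age.
raw = [
    {"id": 1, "age": 30},
    {"id": 2, "age": None},
    {"id": 1, "age": 30},
    {"id": 3, "age": 40},
]
cleaned = clean_records(raw)
```

Here the duplicate of record 1 is removed, and record 2 receives the mean of the two known ages.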

Data Integration

Data integration is a process to combine data from multiple sources into a single, consistent data set.
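A minimal sketch of integration, assuming two hypothetical sources that name the same key differently (`cust_id` in one, `customer` in the other). The function `integrate` and the sample rows are invented for illustration. It performs a simple inner join, so orders without a matching customer are dropped.

```python
def integrate(customers, orders):
    """Join orders to customers on the customer key (inner join)."""
    by_id = {c["cust_id"]: c for c in customers}
    merged = []
    for order in orders:
        customer = by_id.get(order["customer"])
        if customer is None:
            continue  # no matching customer in the other source
        merged.append({
            "cust_id": customer["cust_id"],
            "name": customer["name"],
            "total": order["total"],
        })
    return merged

# Hypothetical sources with differently named key fields.
customers = [{"cust_id": 1, "name": "Ana"}, {"cust_id": 2, "name": "Ben"}]
orders = [
    {"customer": 1, "total": 25.0},
    {"customer": 1, "total": 10.0},
    {"customer": 3, "total": 7.5},
]
merged = integrate(customers, orders)
```

Whether unmatched rows should be dropped, kept, or flagged depends on the mining task. The inner join here is just the simplest choice.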

Data Reduction

Data reduction is a process to reduce a large data set into a smaller one, in such a way that the data can be easily transformed further.
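One common way to reduce data is aggregation. The sketch below, which assumes hypothetical daily sales rows with ISO-formatted dates, collapses them into one total per month. The name `monthly_totals` and the sample values are illustrative only.

```python
from collections import defaultdict

def monthly_totals(rows):
    """Aggregate (date, amount) rows into one total per 'YYYY-MM' month."""
    totals = defaultdict(int)
    for date, amount in rows:
        month = date[:7]  # assumes dates look like 'YYYY-MM-DD'
        totals[month] += amount
    return dict(totals)

# Hypothetical daily sales figures.
daily = [("2024-01-03", 100), ("2024-01-17", 50), ("2024-02-01", 70)]
reduced = monthly_totals(daily)
```

Three daily rows become two monthly rows, at the cost of losing day-level detail.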

Data Transformation

Data transformation is a process to transform the data into a form suitable for mining, for example by normalizing values to a common scale.
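As one possible example of a transformation, the sketch below applies min-max normalization, which rescales values to the range 0 to 1. The function name `min_max_scale` and the input values are assumptions for illustration. Other transformations, such as z-score scaling, may suit a given data set better.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale by.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical attribute values.
scaled = min_max_scale([20, 30, 40, 60])
```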

Data Discretization 

Data discretization converts a large number of data values into a smaller number of intervals or categories, so that data evaluation and data management become much easier.
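A small sketch of discretization, assuming a numeric age attribute that is mapped to a handful of labeled ranges. The bin edges and labels are arbitrary choices made for this example, not standard values.

```python
import bisect

def discretize_age(age, edges=(18, 40, 65),
                   labels=("child", "young adult", "middle-aged", "senior")):
    """Map a numeric age to a label; each edge is the lower bound of the next bin."""
    return labels[bisect.bisect_right(edges, age)]

# Hypothetical ages, including values that fall exactly on a bin edge.
ages = [17, 18, 39, 40, 70]
categories = [discretize_age(a) for a in ages]
```

Many distinct ages collapse into four categories, which is what makes later counting and comparison simpler.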

After the completion of these tasks, the data is ready for mining.

Important topics to know:

