1. Which data mining task does the k-Means algorithm perform?
a) Classification
b) Regression
c) Clustering
d) Association rule mining
Answer: c) Clustering
2. How does the k-Means algorithm initialize cluster centroids?
a) Randomly
b) Using the mean of all data points
c) Based on the median data point
d) By choosing the farthest data points
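Answer: a) Randomly (in the standard algorithm; variants such as k-means++ deliberately spread the initial centroids apart)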
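Random initialization is often done by picking k distinct data points as the starting centroids. The sketch below assumes that approach; the function name init_centroids and the seed parameter are illustrative, not part of any library.

```python
import random

def init_centroids(points, k, seed=None):
    """Pick k distinct data points at random to serve as initial centroids."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    return rng.sample(points, k)

points = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (9.0, 11.0)]
centroids = init_centroids(points, k=2, seed=0)
print(centroids)
```

Fixing the seed makes a run repeatable; leaving it as None gives a different start each time.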
3. What is the role of the ‘k’ parameter in the k-Means algorithm?
a) Number of clusters to be formed
b) Distance metric used for clustering
c) Learning rate for centroid updates
d) Number of iterations for convergence
Answer: a) Number of clusters to be formed
4. How does the k-Means algorithm update cluster centroids during each iteration?
a) By calculating the mean of all data points in each cluster
b) By choosing the data point closest to the centroid
c) By merging clusters with similar centroids
d) By selecting the most central data point
Answer: a) By calculating the mean of all data points in each cluster
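The update step can be sketched as a per-cluster average of coordinates. This is a minimal illustration, assuming points are tuples of numbers and labels holds each point's cluster index; the name update_centroids and the choice to return None for an empty cluster are assumptions made for this example.

```python
def update_centroids(points, labels, k):
    """Return the mean of the points assigned to each of the k clusters."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point, label in zip(points, labels):
        counts[label] += 1
        for axis, value in enumerate(point):
            sums[label][axis] += value
    # An empty cluster has no mean, so it is reported as None here.
    return [
        tuple(total / counts[j] for total in sums[j]) if counts[j] else None
        for j in range(k)
    ]

points = [(1, 1), (3, 3), (10, 10)]
labels = [0, 0, 1]
print(update_centroids(points, labels, k=2))
```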
5. What is a major limitation of the k-Means algorithm?
a) It cannot handle large datasets
b) It requires a large number of clusters
c) It is sensitive to initial centroid positions
d) It cannot handle categorical data
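Answer: c) It is sensitive to initial centroid positions (standard k-Means also expects numeric features, but sensitivity to initialization is the limitation most often cited)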
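A small experiment makes this sensitivity concrete. The sketch below is a bare-bones k-Means written for illustration only (the kmeans function and its return values are assumptions of this example), run on four points at the corners of a wide rectangle with two different starting positions.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-Means; returns final centroids, labels, and the sum of squared errors."""
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of each cluster (an empty cluster keeps its old centroid).
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, sse

points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, _, sse_good = kmeans(points, [(0, 0), (4, 0)])  # starts in the left and right groups
_, _, sse_bad = kmeans(points, [(0, 0), (0, 1)])   # both starts in the left group
print(sse_good, sse_bad)  # 1.0 16.0
```

Both runs converge, but the second settles on a top/bottom split whose error is sixteen times larger, which is why k-Means is commonly restarted from several initializations.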
6. How does the k-Means algorithm determine convergence?
a) When the centroids stop moving significantly between iterations
b) When all data points are assigned to a cluster
c) After a fixed number of iterations
d) When the number of clusters equals ‘k’
Answer: a) When the centroids stop moving significantly between iterations
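One way to express this stopping rule is to compare how far each centroid moved against a small tolerance. The function name has_converged and the default tolerance of 1e-4 are assumptions chosen for this sketch.

```python
import math

def has_converged(old_centroids, new_centroids, tol=1e-4):
    """True when no centroid moved farther than tol between two iterations."""
    largest_shift = max(
        math.dist(old, new) for old, new in zip(old_centroids, new_centroids)
    )
    return largest_shift <= tol

print(has_converged([(1.0, 1.0), (5.0, 5.0)], [(1.0, 1.00001), (5.0, 5.0)]))
print(has_converged([(1.0, 1.0), (5.0, 5.0)], [(1.5, 1.0), (5.0, 5.0)]))
```

In practice this check is usually paired with a maximum iteration count as a safety net.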
7. Which distance metric is commonly used in the k-Means algorithm?
a) Manhattan distance
b) Hamming distance
c) Euclidean distance
d) Cosine similarity
Answer: c) Euclidean distance
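As a brief illustration, Euclidean distance is what decides which centroid a point is assigned to. The sketch uses math.dist from the Python standard library (Python 3.8 or later); the helper name nearest_centroid is an assumption of this example.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid with the smallest Euclidean distance."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

print(math.dist((0, 0), (3, 4)))                   # 5.0
print(nearest_centroid((1, 1), [(0, 0), (5, 5)]))  # 0
```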
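8. What is the computational complexity of one iteration of the k-Means algorithm, treating the number of dimensions as constant?
a) O(n)
b) O(n log n)
c) O(n^2)
d) O(n*k)
Answer: d) O(n*k) per iteration, where n is the number of data points and k is the number of clusters. A full run costs O(n*k*d*i) for d dimensions and i iterations.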
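The n*k term comes from the assignment step, which compares every point with every centroid. The sketch below counts those distance evaluations; the function name assignment_step and the counter it returns are illustrative additions.

```python
import math

def assignment_step(points, centroids):
    """Assign each point to its nearest centroid and count distance evaluations."""
    evaluations = 0
    labels = []
    for point in points:
        best_index, best_distance = 0, math.inf
        for j, centroid in enumerate(centroids):
            distance = math.dist(point, centroid)
            evaluations += 1
            if distance < best_distance:
                best_index, best_distance = j, distance
        labels.append(best_index)
    return labels, evaluations

points = [(0, 0), (1, 0), (5, 5), (6, 5), (10, 0), (11, 0)]
centroids = [(0, 0), (5, 5), (10, 0)]
labels, evaluations = assignment_step(points, centroids)
print(labels, evaluations)  # [0, 0, 1, 1, 2, 2] 18
```

With n = 6 points and k = 3 centroids the loop performs 6 * 3 = 18 distance evaluations, each of which costs time proportional to the number of dimensions.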
9. Which of the following methods can help improve the quality of the clusters found by the k-Means algorithm?
a) Using a larger value of ‘k’
b) Reducing the number of iterations
c) Scaling the data to have equal variance
d) Initializing centroids close to the mean of the data
Answer: c) Scaling the data to have equal variance
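Scaling matters because Euclidean distance is dominated by whichever feature has the largest numeric range. A minimal sketch of z-score standardization follows, assuming rows of numeric features; the function name standardize is illustrative, and constant columns are mapped to 0.0 by choice.

```python
import statistics

def standardize(rows):
    """Rescale every column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        tuple((x - m) / s if s else 0.0 for x, m, s in zip(row, means, stdevs))
        for row in rows
    ]

# The second feature is 100 times larger and would dominate unscaled distances.
rows = [(1.0, 100.0), (2.0, 200.0), (3.0, 300.0)]
for scaled_row in standardize(rows):
    print(scaled_row)
```

After scaling, both features contribute equally to the distance between two rows.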
10. What is the main advantage of the k-Means algorithm?
a) It is robust to outliers
b) It guarantees convergence to the global optimum
c) It can handle non-linear data
d) It is computationally efficient
Answer: d) It is computationally efficient