1. What is the primary objective of DBSCAN in data mining?
a) To classify data points into predefined clusters
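b) To discover clusters as dense regions of points, without a preset number of clusters, and to flag isolated points as noise
c) To reduce the dimensionality of the data
d) To perform regression analysis
Answer: b) To discover clusters as dense regions of points, without a preset number of clusters, and to flag isolated points as noise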
2. How does DBSCAN cluster data points?
a) By assigning each point to the nearest centroid
b) By forming clusters of high-density regions separated by low-density regions
c) By randomly assigning points to clusters
d) By using a dendrogram to visualize clusters
Answer: b) By forming clusters of high-density regions separated by low-density regions
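The density-based grouping described in the answer can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `region_query` and `dbscan`, the label convention (-1 for noise), and the sample points are all assumptions made for this example, and the neighborhood of a point is assumed to include the point itself.

```python
from math import dist

NOISE = -1
UNVISITED = None

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may be relabeled later as a border point
            continue
        cluster += 1                     # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

# Two tight groups of four points and one isolated point (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (10.0, 0.0)]
labels = dbscan(points, eps=0.2, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two dense groups become clusters 0 and 1, and the isolated point is left as noise, with no cluster count supplied.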
3. What are the key parameters in DBSCAN?
a) Number of clusters and number of iterations
b) Epsilon (ε) and minimum points (MinPts)
c) Learning rate and convergence threshold
d) Regularization parameter and kernel type
Answer: b) Epsilon (ε) and minimum points (MinPts)
4. In DBSCAN, what does the parameter ε control?
a) The maximum number of points in a cluster
b) The maximum distance between two points for them to be considered neighbors
c) The number of clusters to form
d) The size of the dataset
Answer: b) The maximum distance between two points for them to be considered neighbors
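ε acts as a neighborhood radius: two points count as neighbors when the distance between them is at most ε. A small sketch of that rule, assuming Euclidean distance; the helper `eps_neighborhood` and the sample points are hypothetical.

```python
from math import dist

def eps_neighborhood(points, center, eps):
    """All points whose Euclidean distance to center is at most eps."""
    return [p for p in points if dist(center, p) <= eps]

points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (3.0, 0.0)]

small = eps_neighborhood(points, (0.0, 0.0), eps=0.6)
large = eps_neighborhood(points, (0.0, 0.0), eps=1.0)

print(len(small))  # 2
print(len(large))  # 3
```

Raising ε widens the radius and admits more neighbors; the point at distance 3.0 stays outside both neighborhoods.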
5. What is noise in the context of DBSCAN?
a) Points that are not assigned to any cluster
b) Outliers that lie far from any cluster
c) Points with high-density values
d) Points that are in the center of clusters
Answer: a) Points that are not assigned to any cluster, that is, points that are neither core points nor within ε of a core point
6. How does DBSCAN handle clusters of varying densities?
a) It merges clusters with similar centroids
b) It uses a hierarchical approach
c) Poorly, because a single global ε and MinPts must fit every cluster
d) It normalizes the dataset
Answer: c) Poorly, because a single global ε and MinPts must fit every cluster
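Because ε and MinPts are global, one setting has to serve every region of the data. The sketch below illustrates what that means for a dense group sitting next to a sparse group. It is a deliberately narrow special case, not general DBSCAN: it assumes 1-D data and MinPts = 2 (the point itself counted), where the clusters reduce to runs of sorted values whose consecutive gaps are at most ε. The function name and the data are hypothetical.

```python
def dbscan_1d_min_pts_2(values, eps):
    """Clusters and noise for 1-D data in the special case MinPts = 2.

    With MinPts = 2 (the point itself counted), every point that has another
    point within eps is a core point, so clusters are runs of sorted values
    whose consecutive gaps are at most eps; isolated values are noise.
    """
    values = sorted(values)
    groups, current = [], [values[0]]
    for prev, cur in zip(values, values[1:]):
        if cur - prev <= eps:
            current.append(cur)
        else:
            groups.append(current)
            current = [cur]
    groups.append(current)
    clusters = [g for g in groups if len(g) >= 2]
    noise = [g[0] for g in groups if len(g) == 1]
    return clusters, noise

dense = [0.0, 0.1, 0.2, 0.3, 0.4]   # spacing 0.1
sparse = [1.0, 2.0, 3.0, 4.0]       # spacing 1.0, only 0.6 away from the dense group
data = dense + sparse

tight_clusters, tight_noise = dbscan_1d_min_pts_2(data, eps=0.15)
loose_clusters, loose_noise = dbscan_1d_min_pts_2(data, eps=1.1)

print(len(tight_clusters), tight_noise)  # 1 [1.0, 2.0, 3.0, 4.0]
print(len(loose_clusters), loose_noise)  # 1 []
```

An ε small enough for the dense group turns the whole sparse group into noise, while an ε large enough for the sparse group merges both groups into one cluster.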
7. What does the term “core point” refer to in DBSCAN?
a) A point that has at least MinPts neighbors within a distance of ε
b) A point that is farthest from the centroid of a cluster
c) A centroid of a cluster
d) A point that is on the boundary of a cluster
Answer: a) A point that has at least MinPts neighbors within a distance of ε
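The core-point test, together with the related border and noise roles, can be sketched as follows. The helper `classify_points` and the sample data are hypothetical, and the neighbor count is assumed to include the point itself, a common convention that should be checked against whichever library is in use.

```python
from math import dist

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise'."""
    # eps-neighborhood of every point, as index lists (the point itself included)
    neighborhoods = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighborhoods) if len(nb) >= min_pts}
    roles = []
    for i, nb in enumerate(neighborhoods):
        if i in core:
            roles.append("core")
        elif any(j in core for j in nb):
            roles.append("border")   # not dense itself, but within eps of a core point
        else:
            roles.append("noise")
    return roles

points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.22, 0.0), (3.0, 3.0)]
roles = classify_points(points, eps=0.15, min_pts=3)
print(roles)  # ['core', 'core', 'core', 'border', 'noise']
```

The fourth point has only one other point within ε, so it is not core, but that neighbor is a core point, which makes it a border point.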
8. Which of the following statements is true about DBSCAN?
a) It requires specifying the number of clusters beforehand
b) It is sensitive to the order of data points
c) It cannot handle non-linear data
d) It is computationally expensive
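Answer: b) It is sensitive to the order of data points. Core points and noise are assigned the same way in any order, but a border point within ε of core points from two clusters joins whichever cluster reaches it first.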
9. What is the worst-case computational complexity of DBSCAN?
a) O(n log n)
b) O(n^2)
c) O(n)
d) O(n*k)
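Answer: b) O(n^2), where n is the number of data points. With a spatial index that answers each neighborhood query in O(log n), the average cost drops to O(n log n).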
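The quadratic cost comes from running one neighborhood query per point, where each query scans all n points if no spatial index is available. A rough sketch that only counts the distance checks; the function name and the test data are assumptions for illustration.

```python
from math import dist

def count_distance_checks(points, eps):
    """Run one brute-force eps-neighborhood query per point and count the distance checks."""
    checks = 0
    for p in points:                      # one neighborhood query per point
        for q in points:                  # without an index, each query scans all n points
            checks += 1
            _within = dist(p, q) <= eps   # result unused; only the work is counted
    return checks

counts = {
    n: count_distance_checks([(float(i), 0.0) for i in range(n)], eps=1.5)
    for n in (10, 20, 40)
}
print(counts)  # {10: 100, 20: 400, 40: 1600}
```

Doubling n quadruples the number of distance checks, which is the O(n^2) behavior named in the answer.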
10. What is a major advantage of DBSCAN?
a) It requires a large number of clusters to be specified in advance
b) It can handle clusters of varying shapes and sizes
c) It is computationally expensive
d) It works well with high-dimensional data
Answer: b) It can handle clusters of varying shapes and sizes