- What is the primary purpose of using decision trees in data mining?
  a) To cluster data points
  b) To find frequent itemsets
  c) To perform classification and regression tasks
  d) To reduce dimensionality
  Answer: c) To perform classification and regression tasks
- In a decision tree, what does each internal node represent?
  a) A decision or test on an attribute
  b) A final classification outcome
  c) A data point
  d) A set of rules
  Answer: a) A decision or test on an attribute
- Which of the following is a commonly used criterion for splitting nodes in a classification decision tree?
  a) Support
  b) Entropy
  c) Mean squared error
  d) Standard deviation
  Answer: b) Entropy
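Entropy, the splitting criterion named in the answer above, is straightforward to compute from class counts. A minimal Python sketch (the function name is ours, not from the quiz):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A 50/50 split is maximally impure (1 bit of entropy):
print(entropy(["yes", "no", "yes", "no"]))  # 1.0
# A pure node (all one class) has zero entropy.
print(abs(entropy(["yes", "yes", "yes"])))
```

A split that drives child-node entropy toward zero is preferred, which is the intuition behind information-gain-based algorithms such as ID3.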
- Which of the following is NOT a common algorithm used for building decision trees?
  a) ID3
  b) C4.5
  c) Apriori
  d) CART
  Answer: c) Apriori
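ID3, one of the tree-building algorithms named above, chooses splits by information gain: the parent node's entropy minus the weighted entropy of the child nodes. A minimal sketch of that computation (function names are ours, for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child partitions."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Splitting a perfectly mixed node into two pure children gains a full bit:
parent = ["yes", "yes", "no", "no"]
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0
```

C4.5 refines this by using gain ratio, and CART uses the Gini index instead; Apriori, by contrast, is a frequent-itemset mining algorithm, which is why it is the correct answer.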
- What does the term “pruning” refer to in the context of decision trees?
  a) Adding more branches to the tree
  b) Removing branches to reduce complexity and avoid overfitting
  c) Splitting nodes based on an attribute
  d) Combining similar branches
  Answer: b) Removing branches to reduce complexity and avoid overfitting
- In decision tree terminology, what is a “leaf node”?
  a) A node where a decision is made
  b) A node that contains a split criterion
  c) A node that represents a final decision or classification
  d) A node that has multiple child nodes
  Answer: c) A node that represents a final decision or classification
- What does the “Gini index” measure in decision trees?
  a) The level of impurity or disorder in a dataset
  b) The accuracy of the model
  c) The number of branches in the tree
  d) The depth of the tree
  Answer: a) The level of impurity or disorder in a dataset
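The Gini index (used by CART as its default splitting criterion) is one minus the sum of squared class proportions. A minimal Python sketch (the function name is ours):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    counts = Counter(labels)
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

print(gini(["yes", "no", "yes", "no"]))  # 0.5 for a 50/50 split
print(gini(["yes", "yes"]))              # 0.0 for a pure node
```

Like entropy, it is zero for a pure node and maximal for an evenly mixed one, but it avoids the logarithm and is slightly cheaper to compute.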
- In the context of decision trees, what is “overfitting”?
  a) When the model is too simple and does not capture the underlying pattern
  b) When the model is too complex and captures noise in the training data
  c) When the model has too few branches
  d) When the model is pruned excessively
  Answer: b) When the model is too complex and captures noise in the training data
- Which of the following is an advantage of using decision trees?
  a) They are easy to understand and interpret
  b) They require a lot of computational resources
  c) They are not suitable for numerical data
  d) They cannot handle missing values
  Answer: a) They are easy to understand and interpret
- Which of the following is a disadvantage of decision trees?
  a) They are not suitable for categorical data
  b) They can easily overfit the training data
  c) They are difficult to interpret
  d) They require a large amount of data
  Answer: b) They can easily overfit the training data