- What is the primary application of the Naive Bayes algorithm in data mining?
a) Clustering
b) Regression
c) Classification
d) Association rule mining
Answer: c) Classification
- What assumption does the Naive Bayes algorithm make about the features in the dataset?
a) The features are highly correlated
b) The features are independent given the class
c) The features are dependent given the class
d) The features follow a linear relationship
Answer: b) The features are independent given the class
- Which of the following distributions is commonly used with the Naive Bayes algorithm for continuous data?
a) Binomial distribution
b) Poisson distribution
c) Gaussian (Normal) distribution
d) Uniform distribution
Answer: c) Gaussian (Normal) distribution
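To illustrate the Gaussian case, here is a minimal sketch of how a Gaussian Naive Bayes model scores a continuous feature: estimate the mean and variance of the feature within a class, then evaluate the Normal density for a new value. The feature values below are hypothetical toy data, not from any real dataset.

```python
import math

def gaussian_pdf(x, mean, var):
    """Gaussian (Normal) density, used by Gaussian Naive Bayes
    to model the likelihood of a continuous feature given a class."""
    coeff = 1.0 / math.sqrt(2 * math.pi * var)
    return coeff * math.exp(-((x - mean) ** 2) / (2 * var))

# Hypothetical feature values observed for one class in training data
values = [1.4, 1.3, 1.5, 1.6, 1.4]
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)

# Likelihood of a new observation under this class's Gaussian
likelihood = gaussian_pdf(1.45, mean, var)
```

The model stores only a mean and variance per feature per class, which is why training is a single pass over the data.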
- What does the Naive Bayes classifier compute to make predictions?
a) The probability of the class given the feature values
b) The likelihood of the feature values given the class
c) The overall frequency of each class
d) The mean and variance of each feature
Answer: a) The probability of the class given the feature values
- In the context of Naive Bayes, what does the term “prior probability” refer to?
a) The probability of a feature given the class
b) The initial probability of a class before observing any features
c) The joint probability of all features
d) The conditional probability of the class given the features
Answer: b) The initial probability of a class before observing any features
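The two questions above fit together via Bayes' theorem: the posterior P(class | features) is proportional to the prior P(class) times the likelihood P(features | class). A toy worked example, with made-up priors and per-feature likelihoods for a hypothetical spam filter:

```python
# Priors: class frequencies from a hypothetical training set
priors = {"spam": 0.4, "ham": 0.6}

# Likelihood of the observed feature values given each class
# (toy numbers; each is a product of per-feature probabilities)
likelihoods = {"spam": 0.7 * 0.2, "ham": 0.1 * 0.5}

# Unnormalised posterior score: prior * likelihood
scores = {c: priors[c] * likelihoods[c] for c in priors}

# Normalise so the posteriors sum to 1, then pick the best class
total = sum(scores.values())
posteriors = {c: s / total for c, s in scores.items()}
prediction = max(posteriors, key=posteriors.get)
```

Note that the higher-prior class ("ham") still loses here because the likelihood term dominates.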
- Which of the following is a major advantage of the Naive Bayes algorithm?
a) It handles missing values well
b) It performs well with small datasets
c) It is computationally efficient and scalable
d) It can model complex relationships between features
Answer: c) It is computationally efficient and scalable
- What is a key limitation of the Naive Bayes algorithm?
a) It requires a large amount of computational resources
b) It assumes independence between features, which may not hold in real-world data
c) It cannot handle continuous data
d) It is difficult to implement
Answer: b) It assumes independence between features, which may not hold in real-world data
- In a Naive Bayes classifier, how are the probabilities for categorical features typically estimated?
a) Using Gaussian distribution
b) Using frequency counts from the training data
c) Using linear regression
d) Using the mean and variance
Answer: b) Using frequency counts from the training data
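A minimal sketch of frequency-count estimation for a categorical feature, with Laplace (add-one) smoothing so that categories unseen in a class never get probability zero. The feature values are hypothetical:

```python
from collections import Counter

def categorical_probs(values, alpha=1):
    """Estimate P(feature = v | class) from frequency counts,
    with Laplace (add-alpha) smoothing over the observed categories."""
    counts = Counter(values)
    categories = sorted(counts)
    total = len(values) + alpha * len(categories)
    return {v: (counts[v] + alpha) / total for v in categories}

# Hypothetical "colour" values observed for one class in training data
probs = categorical_probs(["red", "red", "blue", "green"])
# red appears 2 of 4 times; with add-one smoothing over 3 categories,
# P(red | class) = (2 + 1) / (4 + 3) = 3/7
```

Smoothing matters in practice: without it, a single unseen category would zero out the whole product of likelihoods for that class.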
- Which of the following best describes the “naive” aspect of the Naive Bayes algorithm?
a) It uses a simple linear model
b) It assumes all features contribute equally to the outcome
c) It assumes that features are independent given the class
d) It does not perform well with non-linear data
Answer: c) It assumes that features are independent given the class
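The "naive" independence assumption is what lets the class-conditional likelihood factorise into a product over features. A small sketch, using toy per-feature likelihoods, also showing the standard log-space trick implementations use to avoid numerical underflow when many probabilities are multiplied:

```python
import math

# Toy per-feature likelihoods P(x_i | class) for one class
feature_likelihoods = [0.8, 0.3, 0.6]

# Naive assumption: P(x1, x2, x3 | class) = product of the per-feature terms.
# Products of many small probabilities underflow, so real implementations
# sum log-likelihoods instead and compare classes in log space.
log_likelihood = sum(math.log(p) for p in feature_likelihoods)
joint = math.exp(log_likelihood)  # equals 0.8 * 0.3 * 0.6
```

Since log is monotonic, comparing log-scores across classes picks the same winner as comparing the raw products.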
- What type of problems is the Naive Bayes algorithm particularly well-suited for?
a) High-dimensional data with many features
b) Regression problems with continuous output
c) Clustering problems with unknown labels
d) Problems with highly correlated features
Answer: a) High-dimensional data with many features