Quantitative Data Analytics Qualification: Related Questions

By: Prof. Dr. Fazal Rehman | Last updated: June 15, 2024

Which of the following is a statistical measure of the dispersion or variability in a dataset?
A) Mean
B) Median
C) Variance
D) Mode
Answer: C) Variance

What is the formula for calculating the coefficient of variation (CV)?
A) (Standard Deviation / Mean) x 100
B) (Mean / Standard Deviation) x 100
C) Standard Deviation x Mean
D) Mean + Standard Deviation
Answer: A) (Standard Deviation / Mean) x 100

In data analytics, what does the term “correlation” measure?
A) Causation between variables
B) Relationship between variables
C) Mean of variables
D) Median of variables
Answer: B) Relationship between variables

Which statistical test is used to determine if there is a significant difference between the means of two groups?
A) Chi-square test
B) T-test
C) ANOVA
D) Regression analysis
Answer: B) T-test

In regression analysis, what does the coefficient of determination (R-squared) measure?
A) Strength of the relationship between variables
B) Direction of the relationship between variables
C) Significance of the relationship between variables
D) Proportion of variance explained by the regression model
Answer: D) Proportion of variance explained by the regression model

Which data visualization technique is used to show the distribution of a continuous variable?
A) Histogram
B) Bar chart
C) Pie chart
D) Line graph
Answer: A) Histogram

What does the term “Outlier” refer to in data analytics?
A) An observation that is significantly different from other observations
B) An observation that is similar to other observations
C) An observation with a high correlation coefficient
D) An observation with a low variance
Answer: A) An observation that is significantly different from other observations

Which of the following is NOT a probability sampling technique?
A) Random sampling
B) Stratified sampling
C) Cluster sampling
D) Convenience sampling
Answer: D) Convenience sampling

What is the purpose of hypothesis testing in data analytics?
A) To prove a hypothesis is true
B) To validate data
C) To assess the strength of a relationship
D) To make decisions about population parameters based on sample data
Answer: D) To make decisions about population parameters based on sample data

Which of the following is a measure of central tendency?
A) Range
B) Standard Deviation
C) Mode
D) Variance
Answer: C) Mode

What does the term “Normal distribution” refer to in statistics?
A) A distribution with a bell-shaped curve
B) A distribution with a linear relationship
C) A distribution with no outliers
D) A distribution with a high variance
Answer: A) A distribution with a bell-shaped curve

Which statistical test is used to determine if there is a significant relationship between two categorical variables?
A) T-test
B) Chi-square test
C) ANOVA
D) Regression analysis
Answer: B) Chi-square test

What is the primary goal of data preprocessing in analytics?
A) To increase the size of the dataset
B) To reduce noise and improve data quality
C) To add outliers to the dataset
D) To remove all missing values
Answer: B) To reduce noise and improve data quality

What does the term “Cross-validation” refer to in machine learning?
A) Training a model on one dataset and testing it on another
B) Splitting data into training and testing sets
C) Evaluating a model’s performance using multiple subsets of data
D) Using multiple algorithms to build a model
Answer: C) Evaluating a model’s performance using multiple subsets of data

Which of the following is a measure of association used for ordinal (ranked) data?
A) Pearson correlation coefficient
B) Spearman’s rank correlation coefficient
C) Coefficient of determination
D) ANOVA
Answer: B) Spearman’s rank correlation coefficient

What is the purpose of data visualization in analytics?
A) To make data look more complex
B) To communicate insights effectively
C) To hide outliers in the data
D) To replace statistical analysis
Answer: B) To communicate insights effectively

Which data structure is used to store data in a hierarchical format?
A) List
B) Array
C) Tree
D) Queue
Answer: C) Tree

Which of the following is NOT a dimensionality reduction technique?
A) Principal Component Analysis (PCA)
B) Linear Regression
C) t-SNE (t-distributed Stochastic Neighbor Embedding)
D) Singular Value Decomposition (SVD)
Answer: B) Linear Regression

What is the purpose of A/B testing in data analytics?
A) To compare two different datasets
B) To compare two versions of a website or application and see which performs better
C) To conduct hypothesis testing
D) To calculate variance
Answer: B) To compare two versions of a website or application and see which performs better

Which of the following is a supervised learning algorithm?
A) K-means clustering
B) Decision tree
C) Apriori algorithm
D) DBSCAN
Answer: B) Decision tree

What does the term “Precision” refer to in classification models?
A) The number of true positive predictions divided by the total number of positive predictions
B) The number of true positive predictions divided by the total number of actual positives
C) The number of true negative predictions divided by the total number of negative predictions
D) The number of true negative predictions divided by the total number of actual negatives
Answer: A) The number of true positive predictions divided by the total number of positive predictions

Which of the following is a type of non-probability sampling technique?
A) Random sampling
B) Stratified sampling
C) Snowball sampling
D) Systematic sampling
Answer: C) Snowball sampling

What does the term “Data Mining” refer to in analytics?
A) Extracting valuable information from data
B) Storing data in a secure location
C) Adding noise to data
D) Removing outliers from data
Answer: A) Extracting valuable information from data

Which statistical test is used to determine if there is a significant difference between the means of more than two groups?
A) T-test
B) Chi-square test
C) ANOVA
D) Regression analysis
Answer: C) ANOVA

What is the primary goal of feature engineering in machine learning?
A) To create new features from existing data
B) To remove features from the dataset
C) To increase model complexity
D) To decrease model accuracy
Answer: A) To create new features from existing data

What does the term “Overfitting” refer to in machine learning?
A) When a model performs well on training data but poorly on new data
B) When a model performs poorly on training data
C) When a model is too simple
D) When a model is not trained properly
Answer: A) When a model performs well on training data but poorly on new data

Which of the following is NOT a classification algorithm?
A) Logistic Regression
B) K-nearest neighbors (KNN)
C) Linear Regression
D) Support Vector Machine (SVM)
Answer: C) Linear Regression

What is the purpose of regularization in machine learning?
A) To penalize complex models
B) To increase model bias
C) To decrease model variance
D) To simplify feature selection
Answer: A) To penalize complex models

Which of the following is a method for handling missing data in a dataset?
A) Removing rows with missing data
B) Replacing missing data with the mean of the column
C) Ignoring missing data
D) All of the above
Answer: D) All of the above

What does the term “Confusion Matrix” represent in classification models?
A) A matrix that shows the relationship between variables
B) A matrix that shows the performance of a classification model
C) A matrix that shows the correlation between variables
D) A matrix that shows the mean of variables
Answer: B) A matrix that shows the performance of a classification model

Which of the following is NOT a step in the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework?
A) Data Understanding
B) Data Visualization
C) Data Preparation
D) Model Evaluation
Answer: B) Data Visualization

What does the term “Logistic Regression” refer to in machine learning?
A) A regression algorithm used for predicting binary outcomes
B) A regression algorithm used for predicting binary outcomes
C) A classification algorithm used for predicting continuous outcomes
D) A classification algorithm
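
Worked example: a few of the answers in this quiz come down to short formulas, namely variance, the coefficient of variation, and precision. The Python sketch below shows one way to compute them with the standard library. It is an illustration, not part of the original quiz: the function names, the sample data, and the confusion-matrix counts are all hypothetical, and the choice of the population standard deviation (rather than the sample one) is an assumption, since the quiz does not say which is meant. Recall is included only because it matches the "actual positives" option that the precision question rules out.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV as a percentage: (standard deviation / mean) x 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: population standard deviation.
    # Swap in statistics.stdev for the sample version.
    return 100 * statistics.pstdev(values) / mean


def precision(true_positives, false_positives):
    """True positives divided by all positive predictions."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("the model made no positive predictions")
    return true_positives / predicted_positives


def recall(true_positives, false_negatives):
    """True positives divided by all actual positives."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("the data contains no actual positives")
    return true_positives / actual_positives


# Hypothetical data, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
variance = statistics.pvariance(scores)  # population variance
cv = coefficient_of_variation(scores)

# Hypothetical confusion-matrix counts.
tp, fp, fn = 40, 10, 20

print(f"variance = {variance}")
print(f"CV = {cv:.1f}%")
print(f"precision = {precision(tp, fp):.2f}")
print(f"recall = {recall(tp, fn):.2f}")
```

Keeping precision and recall as separate functions makes the difference between the two denominators explicit: predicted positives for precision, actual positives for recall.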