Caret Data Science MCQs: Questions and Answers

By: Prof. Fazal Rehman Shamil

Which function is used to generate balanced cross-validation groupings from a set of data?

a) createResample

b) createSample

c) createFolds

d) none of these

MCQ Answer: c
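
To make "balanced groupings" concrete, here is a minimal Python sketch of stratified fold assignment for a categorical outcome. caret itself is an R package, so this is only an illustration of the idea and not caret's code; the function name `stratified_folds` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split row indices into k folds, keeping class proportions similar.

    Hypothetical helper for illustration; not caret's createFolds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's rows out round-robin so every fold gets a share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return [sorted(fold) for fold in folds]


labels = ["yes"] * 6 + ["no"] * 4
for fold in stratified_folds(labels, k=2):
    yes_rows = sum(1 for i in fold if labels[i] == "yes")
    print(fold, "yes rows:", yes_rows)
```

Each fold in this example ends up with three "yes" rows and two "no" rows, which is the balance property the question refers to.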


Which of the following is the wrong statement?

a) Three parameters are used for time series splitting

b) Simple random sampling of time series is probably the best way to resample time series data.

c) The horizon parameter is the number of consecutive values in the test set sample

d) All of these

MCQ Answer: b


Which of the following functions can be used to maximize the minimum dissimilarities?

a) sumDiss

b) avgDiss

c) minDiss

d) All of these

MCQ Answer: c


Which function can create the indices for the time series type of splitting?

a) createTimeSlices

b) newTimeSlices

c) binTimeSlices

d) none of these

MCQ Answer: a
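
The rolling-origin idea behind time series splitting can be sketched in a few lines of Python. This is an illustration only, not caret's R code: the function name `time_slices` is hypothetical, indices are 0-based, and the three parameters mirror the initial window, horizon, and fixed-window options discussed in the questions above.

```python
def time_slices(n, initial_window, horizon=1, fixed_window=True):
    """Return (train, test) index lists for rolling-origin resampling.

    Hypothetical helper for illustration; indices are 0-based.
    """
    slices = []
    for stop in range(initial_window, n - horizon + 1):
        # A fixed window slides forward; otherwise training always starts at 0.
        start = stop - initial_window if fixed_window else 0
        train = list(range(start, stop))
        test = list(range(stop, stop + horizon))
        slices.append((train, test))
    return slices


for train, test in time_slices(6, initial_window=3, horizon=2):
    print("train:", train, "test:", test)
# train: [0, 1, 2] test: [3, 4]
# train: [1, 2, 3] test: [4, 5]
```

The test indices always follow the training indices in time, which is why this scheme is preferred over simple random sampling for time series.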


Which of the following is the correct statement?

a) Caret includes several functions to pre-process the predictor data

b) Asymptotics are typically used for inference

c) The function dummyVars can be helpful to generate a complete set of dummy variables from one or more factors

d) All of these

MCQ Answer: d


Which function is used to create sub-samples using a maximum dissimilarity approach?

a) minDissim

b) inmaxDissim

c) maxDissim

d) All of these

MCQ Answer: c
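
A maximum dissimilarity approach can be sketched as a greedy loop: starting from a small selected set, repeatedly add the candidate that is most unlike the points already chosen. The Python below is an illustrative sketch under assumptions, not caret's implementation; the name `max_dissim_sketch` is hypothetical, points are plain coordinate tuples, and Euclidean distance is assumed. Passing `score=min` scores a candidate by its smallest distance to the selected set (in the spirit of a minDiss-style objective), while `score=sum` uses the total distance (a sumDiss-style objective).

```python
import math


def max_dissim_sketch(selected, pool, n_add, score=min):
    """Greedily pick n_add points from pool that are most unlike `selected`.

    Hypothetical helper for illustration; uses Euclidean distance.
    """
    selected = list(selected)
    pool = list(pool)
    chosen = []
    for _ in range(min(n_add, len(pool))):
        # Score every candidate by its distances to the current selected set.
        best = max(
            range(len(pool)),
            key=lambda i: score(math.dist(pool[i], s) for s in selected),
        )
        point = pool.pop(best)
        selected.append(point)
        chosen.append(point)
    return chosen


print(max_dissim_sketch([(0, 0)], [(1, 0), (5, 0), (3, 0)], n_add=2))
# [(5, 0), (3, 0)]
```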


caret does not use the proxy package.

a) True

b) False

MCQ Answer: b


Which function can be helpful to create balanced splits of the data?

a) newDataPartition

b) renameDataPartition

c) createDataPartition

d) none of these

MCQ Answer: c


Which tools are present in the caret package?

a) model tuning

b) feature selection

c) pre-processing

d) All of these

MCQ Answer: d


caret stands for classification and regression training.

a) True

b) False

MCQ Answer: a


Which of the following functions is a wrapper for different lattice plots to visualize the data?

a) featurePlot

b) levelplot

c) plotsample

d) None of these

MCQ Answer: a


Which of the following is the wrong statement?

a) In every situation, the data generating mechanism can create predictors that only have a single unique value

b) Predictors might have only a handful of unique values that occur with very low frequencies

c) The function findLinearCombos uses the QR decomposition of a matrix to enumerate sets of linear combinations

d) All of these

MCQ Answer: a


Which function can be helpful to identify near zero-variance variables?

a) nearZeroVar

b) nearVar

c) zeroVar

d) All of these

MCQ Answer: a
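
A near zero-variance check is commonly built from two numbers: the ratio of the most common value's frequency to the second most common, and the percentage of distinct values. The Python sketch below illustrates that logic; it is not caret's code, the name `is_near_zero_var` is hypothetical, and the cutoffs (95/5 and 10) are assumed defaults that should be checked against the caret documentation.

```python
from collections import Counter


def is_near_zero_var(values, freq_cut=95 / 5, unique_cut=10.0):
    """Flag a predictor whose values are (almost) all the same.

    Hypothetical helper for illustration; cutoffs are assumptions.
    """
    counts = Counter(values).most_common()
    if len(counts) <= 1:
        # Zero or one distinct value: the predictor carries no information.
        return True
    freq_ratio = counts[0][1] / counts[1][1]
    percent_unique = 100.0 * len(counts) / len(values)
    return freq_ratio > freq_cut and percent_unique <= unique_cut


print(is_near_zero_var([0] * 99 + [1]))  # True: one value dominates
print(is_near_zero_var([0, 1] * 50))     # False: values are evenly split
```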


Which function can be helpful to flag predictors for removal?

a) searchCorrelation

b) findCorrelation

c) findCausation

d) none of these

MCQ Answer: b


Which of the following is the correct statement?

a) findLinearColumns will also return a vector of column positions that can be removed to eliminate the linear dependencies

b) the function findLinearRows can be helpful to generate a complete set of row variables from one factor

c) findLinearCombos will return a list that enumerates dependencies

d) None of these

MCQ Answer: c


Which function can be used to impute data sets based only on information in the training set?

a) preProcess

b) postProcess

c) process

d) All of these

MCQ Answer: a


The function preProcess estimates the required parameters for each operation.

a) True

b) False

MCQ Answer: a


Which of the following can also be helpful to find new variables that are linear combinations of the original set with independent components?

a) PCA

b) SCA

c) ICA

d) None of these

MCQ Answer: c


Which function is helpful to generate the class distances?

a) predict.classDist

b) preprocess.classDist

c) predict.classDistance

d) All of these

MCQ Answer: a


The preProcess class can be used for many operations on predictors.

a) True

b) False

MCQ Answer: a


varImp is a wrapper around the evimp function in which of the following packages?

a) numpy

b) plot

c) earth

d) none of these

MCQ Answer: c


Which of the following is the wrong statement?

a) An argument, para, is used to choose the model fitting technique

b) For regression, the relationship between each predictor and the outcome is evaluated

c) The trapezoidal rule is used to compute the area under the ROC curve

d) All of these

MCQ Answer: a
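
The trapezoidal rule mentioned in option c can be shown with a short Python sketch: build the ROC points by sweeping a threshold over one predictor's values, then sum the trapezoid areas between consecutive points. This is an illustration, not caret's implementation; the name `auc_trapezoid` is hypothetical, and labels are assumed to be 1 for the positive class and 0 for the negative class.

```python
def auc_trapezoid(scores, labels):
    """Area under the ROC curve via the trapezoidal rule.

    Hypothetical helper for illustration; labels are 1 (positive) or 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Process tied scores together so they yield a single ROC point.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / negatives, tp / positives))
        i = j

    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


print(auc_trapezoid([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))  # 0.75
```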


Which of the following curve analyses is conducted on each predictor for classification?

a) NOC

b) COC

c) ROC

d) All of these

MCQ Answer: c


Which of the following functions tracks the changes in model statistics?

a) findTrack

b) varImpTrack

c) varImp

d) none of these

MCQ Answer: c


Which of the following is the correct statement?

a) Boosted Trees uses a different approach from a single tree

b) The Bagged Trees output holds variable usage statistics

c) The difference between the class centroids and the overall centroid is helpful to measure the variable influence

d) None of these

MCQ Answer: c


What model includes a backward elimination feature selection routine?

a) MCV

b) MCRS

c) MARS

d) All of these

MCQ Answer: c


The benefit of using a model-based method is that it is more closely tied to the model performance.

a) True

b) False

MCQ Answer: a


What model sums the importance over each boosting iteration?

a) Partial least squares

b) Bagged trees

c) Boosted trees

d) None of these

MCQ Answer: c


What argument is used to set importance values?

a) set

b) scale

c) value

d) All of these

MCQ Answer: b


For most classification models, each predictor will have a separate variable importance for each class.

a) True

b) False

MCQ Answer: a
