MCQs on Statistical Modeling | R

Statistical modeling in R is a core skill for data analysis and prediction. The questions in this chapter cover linear and logistic regression, time series analysis with ARIMA models, and model validation and diagnostics.


Chapter: Statistical Modeling in R – MCQs

1. Linear and Logistic Regression

  1. What is the purpose of linear regression in R?
    • a) To predict categorical outcomes
    • b) To predict continuous outcomes
    • c) To calculate the correlation between variables
    • d) To cluster data
  2. Which function in R is used for fitting a linear regression model?
    • a) lm()
    • b) logit()
    • c) glm()
    • d) fit()
  3. Which of the following is an assumption of linear regression?
    • a) The relationship between variables is nonlinear
    • b) The residuals are normally distributed
    • c) The data must be categorical
    • d) The predictors must be independent
  4. Which of the following is true about logistic regression?
    • a) It is used for predicting continuous variables
    • b) It is used for predicting categorical variables
    • c) It assumes a normal distribution of errors
    • d) It does not require any transformations of the dependent variable
  5. What is the main output of a logistic regression model in R?
    • a) Coefficients and odds ratios
    • b) Predicted probabilities
    • c) Residuals
    • d) AUC score
  6. Which of the following is NOT a valid link function for a binomial model fitted with glm() in R?
    • a) Logit
    • b) Probit
    • c) Identity
    • d) Complementary log-log
  7. What is the interpretation of the exponentiated coefficients in logistic regression?
    • a) They represent the mean difference between groups
    • b) They are the probabilities of the outcome event
    • c) They represent the odds ratio for a one-unit change in the predictor
    • d) They represent the R-squared value
  8. What does the “Null Deviance” in a logistic regression output represent?
    • a) The deviance of a model with only the intercept
    • b) The deviance of the fitted model
    • c) The predicted value of the model
    • d) The variance of the residuals
  9. How do you assess the goodness of fit in logistic regression?
    • a) Using R-squared
    • b) Using deviance or AIC
    • c) Using correlation coefficient
    • d) Using MSE (Mean Squared Error)
  10. What is the effect of multicollinearity in linear regression?
    • a) It improves the model
    • b) It makes coefficient estimates unstable and inflates their standard errors
    • c) It helps in model selection
    • d) It is not a concern in linear regression

2. Time Series Analysis and ARIMA Models

  1. What is a time series model used for?
    • a) To predict future values based on historical data
    • b) To cluster time-based data
    • c) To reduce dimensionality of time data
    • d) To visualize trends in time series
  2. What does ARIMA stand for?
    • a) Autoregressive Integrated Moving Average
    • b) Autoregressive Interpolation Moving Average
    • c) Automated Regression Integrated Moving Average
    • d) Autoregressive Independent Moving Average
  3. Which component of ARIMA is responsible for the autoregressive part?
    • a) AR
    • b) MA
    • c) I
    • d) T
  4. In an ARIMA model, what does the “I” stand for?
    • a) Interaction
    • b) Integration
    • c) Inference
    • d) Indicator
  5. What is the primary purpose of differencing in ARIMA models?
    • a) To transform the data into a stationary series
    • b) To reduce noise in the data
    • c) To adjust for seasonal effects
    • d) To estimate the model’s accuracy
  6. Which of the following is NOT a component of a typical time series?
    • a) Trend
    • b) Seasonal
    • c) Irregular
    • d) Logistic
  7. Which function from the forecast package automatically selects and fits an ARIMA model in R?
    • a) auto.arima()
    • b) arima()
    • c) tslm()
    • d) lm()
  8. How do you check the stationarity of a time series in R?
    • a) Using the Augmented Dickey-Fuller (ADF) test
    • b) Using the Shapiro-Wilk test
    • c) By visualizing the series only
    • d) Using the chi-squared test
  9. What is a common diagnostic plot used to assess the residuals in an ARIMA model?
    • a) Histogram
    • b) Q-Q plot
    • c) Autocorrelation Function (ACF) plot
    • d) Boxplot
  10. What does the ACF (Autocorrelation Function) plot show?
    • a) The relationship between the residuals and fitted values
    • b) The correlation between observations at different time lags
    • c) The variance of the time series
    • d) The distribution of the residuals

3. Model Validation and Diagnostics

  1. Which of the following is used to assess the accuracy of a regression model in R?
    • a) R-squared
    • b) p-value
    • c) Coefficients
    • d) Degrees of freedom
  2. What is cross-validation used for in statistical modeling?
    • a) To evaluate model performance using a subset of data
    • b) To estimate coefficients
    • c) To adjust for outliers
    • d) To check for multicollinearity
  3. Which metric is commonly used for model evaluation in classification tasks?
    • a) Mean Squared Error (MSE)
    • b) Confusion matrix
    • c) R-squared
    • d) F-test
  4. What is the purpose of the residuals in a regression model?
    • a) To estimate the error term
    • b) To check the assumption of homoscedasticity
    • c) To evaluate model accuracy
    • d) To check for overfitting
  5. What is the assumption of homoscedasticity in regression analysis?
    • a) The variance of the errors is constant across all levels of the independent variable
    • b) The errors are normally distributed
    • c) The data must be uncorrelated
    • d) The errors should follow a linear trend
  6. Which function from the car package calculates the Variance Inflation Factor (VIF) in R?
    • a) vif()
    • b) vif.lm()
    • c) vif.stats()
    • d) calc_vif()
  7. Which diagnostic plot is used to check for normality in residuals?
    • a) Residual plot
    • b) Q-Q plot
    • c) ACF plot
    • d) Histogram plot
  8. What is the purpose of the p-value in statistical models?
    • a) To indicate the significance of the model coefficients
    • b) To estimate the variance
    • c) To check for linearity
    • d) To assess the quality of residuals
  9. Which measure is used to detect multicollinearity in regression models?
    • a) VIF (Variance Inflation Factor)
    • b) Shapiro-Wilk test
    • c) t-test
    • d) Chi-squared test
  10. How do you assess whether a model is overfitting?
    • a) By using the cross-validation technique
    • b) By checking residuals only
    • c) By adjusting the model parameters manually
    • d) By visualizing the data

Answers Table

Answers are numbered consecutively across the three sections: 1-10 for Section 1, 11-20 for Section 2, and 21-30 for Section 3.

Qno. Answer
1. b) To predict continuous outcomes
2. a) lm()
3. b) The residuals are normally distributed
4. b) It is used for predicting categorical variables
5. a) Coefficients and odds ratios
6. c) Identity
7. c) They represent the odds ratio for a one-unit change in the predictor
8. a) The deviance of a model with only the intercept
9. b) Using deviance or AIC
10. b) It makes coefficient estimates unstable and inflates their standard errors
11. a) To predict future values based on historical data
12. a) Autoregressive Integrated Moving Average
13. a) AR
14. b) Integration
15. a) To transform the data into a stationary series
16. d) Logistic
17. a) auto.arima()
18. a) Using the Augmented Dickey-Fuller (ADF) test
19. c) Autocorrelation Function (ACF) plot
20. b) The correlation between observations at different time lags
21. a) R-squared
22. a) To evaluate model performance using a subset of data
23. b) Confusion matrix
24. a) To estimate the error term
25. a) The variance of the errors is constant across all levels of the independent variable
26. a) vif()
27. b) Q-Q plot
28. a) To indicate the significance of the model coefficients
29. a) VIF (Variance Inflation Factor)
30. a) By using the cross-validation technique
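
For readers who want to try the ideas behind these answers, the following is a minimal sketch, not a reference solution. It assumes only base R with the built-in mtcars and Nile datasets; the object names (lin_fit, log_fit, ts_fit and so on) and the choice of predictors and ARIMA order are illustrative assumptions, not taken from the quiz. The VIF is computed by hand so that no extra package is needed.

```r
# Sketch using only base R (stats, datasets); all object names are illustrative.

# Linear regression: a continuous outcome (mpg) on two predictors.
lin_fit <- lm(mpg ~ wt + hp, data = mtcars)
lin_r2 <- summary(lin_fit)$r.squared

# Variance inflation factor by hand: VIF_j = 1 / (1 - R^2_j), where R^2_j
# comes from regressing predictor j on the remaining predictors.
vif_wt <- 1 / (1 - summary(lm(wt ~ hp, data = mtcars))$r.squared)

# Logistic regression: a binary outcome (am) with the logit link.
log_fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
odds_ratios <- exp(coef(log_fit))                 # coefficients are log-odds
pred_prob <- predict(log_fit, type = "response")  # fitted probabilities
null_dev <- log_fit$null.deviance                 # intercept-only model
resid_dev <- deviance(log_fit)                    # fitted model
model_aic <- AIC(log_fit)

# ARIMA(1, 1, 1) on the yearly Nile flow series; d = 1 means the series is
# differenced once before the AR and MA terms are estimated.
ts_fit <- arima(Nile, order = c(1, 1, 1))
ts_res <- residuals(ts_fit)

# Residual diagnostics: autocorrelation by lag and a Ljung-Box test.
res_acf <- acf(ts_res, plot = FALSE)
lb_test <- Box.test(ts_res, lag = 10, type = "Ljung-Box", fitdf = 2)
```

In an interactive session, summary(lin_fit) and summary(log_fit) print the coefficient tables and p-values, qqnorm(residuals(lin_fit)) draws a Q-Q plot of the residuals, and plot(res_acf) draws the residual ACF plot.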

Use a blank sheet to note your answers, then tally them against the answers table and give yourself a score.
