InterviewSolution
This section includes InterviewSolutions, each offering curated multiple-choice questions to sharpen your knowledge and support exam preparation.
| 1. |
A collection of exchangeable binary outcomes for the same covariate data is called _______ outcomes. (a) random (b) direct (c) binomial (d) none of the mentioned. Topic: Binary and Count Outcomes (Statistical Inference and Regression Models) |
|
Answer» Right option is (c) binomial. Explanation: The multivariate regression model for binary outcomes gives odds ratios, not risk ratios. |
|
| 2. |
Point out the correct statement. (a) A standard error is needed to create a prediction interval (b) The prediction interval must incorporate the variability in the data around the line (c) Investors use the residual variance to measure the accuracy of their predictions on the value of an asset (d) All of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» Correct answer is (d) All of the mentioned. Explanation: In statistics, explained variation measures the proportion to which a mathematical model accounts for the variation of a given data set. |
|
| 3. |
Point out the correct statement. (a) The mean is a measure of central tendency of the data (b) Empirical mean is related to “centering” the random variables (c) The empirical standard deviation is a measure of spread (d) All of the mentioned. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (d) All of the mentioned. Explanation: The process of centering and scaling the data is called “normalizing” the data. |
|
| 4. |
Which of the following values is the most common measure of “statistical significance”? (a) P (b) A (c) L (d) All of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Right answer is (a) P |
|
| 5. |
__________ random variables are used to model rates. (a) Empirical (b) Binomial (c) Poisson (d) All of the mentioned. Topic: Common Distributions (Statistical Inference and Regression Models) |
|
Answer» Correct option is (c) Poisson. Explanation: The Poisson distribution is used to model counts. |
|
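The link between rates and Poisson counts in the question above can be sketched in a few lines. This is an illustrative sketch only: the rate, the observation window, and the helper name `poisson_pmf` are assumptions and do not come from the quiz.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical numbers: 2.5 events per hour, observed for 4 hours.
rate_per_hour = 2.5
hours = 4.0
lam = rate_per_hour * hours  # expected count = rate * monitoring time

# Truncate the infinite support at 100; the remaining tail mass is negligible here.
probs = [poisson_pmf(k, lam) for k in range(100)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
```

Under these assumptions the probabilities sum to 1 and the mean count equals the rate multiplied by the observation time.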
| 6. |
Which of the following is an example use of the Poisson distribution? (a) Analyzing contingency table data (b) Modeling web traffic hits (c) Incidence rates (d) All of the mentioned. Topic: Binary and Count Outcomes (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (d) All of the mentioned |
|
| 7. |
Which of the following functions is associated with a continuous random variable? (a) pdf (b) pmv (c) pmf (d) all of the mentioned. Topic: Introduction to Statistical Inference (Statistical Inference and Regression Models) |
|
Answer» Correct answer is (a) pdf. Explanation: pdf stands for probability density function. |
|
| 8. |
Point out the correct statement. (a) Bayesian inference is the use of Bayesian probability representation of beliefs to perform inference (b) NULL is the standard missing data marker used in S (c) Frequency inference is the use of Bayesian probability representation of beliefs to perform inference (d) None of the mentioned. Topic: Introduction to Statistical Inference (Statistical Inference and Regression Models) |
|
Answer» The correct option is (a) Bayesian inference is the use of Bayesian probability representation of beliefs to perform inference |
|
| 9. |
Which of the following refers to the circumstance in which the variability of a variable is unequal across the range of values of a second variable that predicts it? (a) Heterogeneity (b) Heteroskedasticity (c) Heteroelasticty (d) None of the mentioned. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» Right option is (b) Heteroskedasticity. Explanation: Heteroskedasticity has serious consequences for the OLS estimator. |
|
| 10. |
Point out the wrong statement. (a) Asymptotics generally give assurances about finite sample performance (b) The sample variance and the sample standard deviation are consistent as well (c) The sample mean and the sample variance are unbiased as well (d) None of the mentioned. Topic: Likelihood (Statistical Inference and Regression Models) |
|
Answer» Correct option is (a) Asymptotics generally give assurances about finite sample performance |
|
| 11. |
Which of the following random variables are the default model for random samples? (a) iid (b) id (c) pmd (d) all of the mentioned. Topic: Probability and Statistics (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (a) iid |
|
| 12. |
Normalized data are centered at ___ and have units equal to standard deviations of the original data. (a) 0 (b) 5 (c) 1 (d) 10. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (a) 0 |
|
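As a minimal sketch of the normalization described in the question above, the snippet below centers and scales a small sample. The data values and the helper name `normalize` are hypothetical, and it assumes the sample (n - 1) standard deviation is the scaling factor.

```python
import statistics

def normalize(values):
    """Center at the sample mean and scale by the sample standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - m) / s for v in values]

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # made-up observations
z = normalize(data)

z_mean = statistics.mean(z)   # approximately 0
z_sd = statistics.stdev(z)    # approximately 1
```

After the transformation each value is measured in standard deviations from the original mean, which is why the normalized data are centered at 0.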
| 13. |
Point out the wrong statement with respect to FDR. (a) FDR is difficult to calculate (b) FDR is relatively less conservative (c) FDR allows for more false positives (d) None of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Right answer is (a) FDR is difficult to calculate. Explanation: FDR stands for false discovery rate. |
|
| 14. |
What is the purpose of multiple testing in statistical inference? (a) Minimize errors (b) Minimize false positives (c) Minimize false negatives (d) All of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» The correct answer is (d) All of the mentioned |
|
| 15. |
Which of the following kinds of testing is concerned with making decisions using data? (a) Probability (b) Hypothesis (c) Causal (d) None of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (b) Hypothesis. Explanation: The null hypothesis is assumed true and statistical evidence is required to reject it in favor of a research or alternative hypothesis. |
|
| 16. |
Which of the following goals is incorrectly represented in the figure below? (a) Relationship between variables (b) Distribution of variables (c) Inference about relationships (d) Causal. Topic: Common Distributions (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (d) Causal |
|
| 17. |
Which of the following can be useful for diagnosing data entry errors? (a) hat values (b) dffit (c) resid (d) all of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» Right choice is (a) hat values |
|
| 18. |
Which of the following functions can replace the question mark in the figure below? (a) boxplot (b) lplot (c) levelplot (d) all of the mentioned. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (c) levelplot |
|
| 19. |
Point out the correct statement. (a) The exponential of a normally distributed random variable follows what is called the log-normal distribution (b) Sums of normally distributed random variables are again normally distributed even if the variables are dependent (c) The square of a standard normal random variable follows what is called the chi-squared distribution (d) All of the mentioned. Topic: Common Distributions (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (d) All of the mentioned |
|
| 20. |
Point out the correct statement. (a) Some cumulative distribution function F is non-decreasing and right-continuous (b) Every cumulative distribution function F is decreasing and right-continuous (c) Every cumulative distribution function F is increasing and left-continuous (d) None of the mentioned. Topic: Probability and Statistics (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (d) None of the mentioned. Explanation: Every cumulative distribution function F is non-decreasing and right-continuous. |
|
| 21. |
Principal components or factor analytic models on covariates are often useful for reducing complex covariate spaces. (a) True (b) False. Topic: Binary and Count Outcomes (Statistical Inference and Regression Models) |
|
Answer» Correct option is (a) True. Explanation: The space of models explodes quickly as you add interactions and polynomial terms. |
|
| 22. |
How many components are present in generalized linear models? (a) 2 (b) 4 (c) 6 (d) None of the mentioned. Topic: Binary and Count Outcomes (Statistical Inference and Regression Models) |
|
Answer» Correct answer is (d) None of the mentioned. Explanation: A generalized linear model has three components: an exponential family model for the response, a linear predictor, and a link function. |
|
| 23. |
Which of the following is the correct formula for total variation? (a) Total Variation = Residual Variation – Regression Variation (b) Total Variation = Residual Variation + Regression Variation (c) Total Variation = Residual Variation * Regression Variation (d) All of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (b) Total Variation = Residual Variation + Regression Variation |
|
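The decomposition in the answer above can be checked numerically for a least-squares line with an intercept. The following is a minimal sketch on made-up data; the helper name `fit_line` and the data values are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up predictor values
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up responses

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * a for a in x]
mean_y = sum(y) / len(y)

total = sum((b - mean_y) ** 2 for b in y)              # total variation
regression = sum((f - mean_y) ** 2 for f in fitted)    # regression variation
residual = sum((b - f) ** 2 for b, f in zip(y, fitted))  # residual variation
r_squared = regression / total                         # share of variation explained
```

The identity holds up to floating-point rounding, and the ratio `regression / total` is the usual R squared.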
| 24. |
The _________ of a collection of data is the joint density evaluated as a function of the parameters with the data fixed. (a) probability (b) likelihood (c) poisson distribution (d) all of the mentioned. Topic: Likelihood (Statistical Inference and Regression Models) |
|
Answer» Correct answer is (b) likelihood |
|
| 25. |
Point out the correct statement. (a) Power of a one sided test is lower than the power of the associated two sided test (b) Power of a two sided test is greater than the power of the associated one sided test (c) Hypothesis testing is less commonly used (d) None of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Right choice is (d) None of the mentioned |
|
| 26. |
Which of the following forms the basis for the frequency interpretation of probabilities? (a) Asymptotics (b) Symptotics (c) Asymmetry (d) All of the mentioned. Topic: Common Distributions (Statistical Inference and Regression Models) |
|
Answer» Right option is (a) Asymptotics |
|
| 27. |
For continuous random variables, the CDF is the derivative of the PDF. (a) True (b) False. Topic: Probability and Statistics (Statistical Inference and Regression Models) |
|
Answer» Right answer is (b) False |
|
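The statement above is false because the relationship runs the other way: the PDF is the derivative of the CDF. A rough numerical check for the standard normal distribution is sketched below. The choice of distribution, the evaluation points, and the step size are assumptions made for illustration.

```python
import math

def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf(x: float) -> float:
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def numerical_derivative(f, x: float, h: float = 1e-5) -> float:
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

points = [-2.0, -0.5, 0.0, 1.0, 2.5]
# Largest gap between d/dx CDF(x) and PDF(x) over the chosen points.
max_gap = max(abs(numerical_derivative(normal_cdf, x) - normal_pdf(x)) for x in points)
```

The gap is tiny at every point, which is consistent with differentiating the CDF to obtain the PDF.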
| 28. |
Which of the following conditions should be satisfied by a function for it to be a pmf? (a) The sum of all of the possible values is 1 (b) The sum of all of the possible values is 0 (c) The sum of all of the possible values is infinite (d) All of the mentioned. Topic: Introduction to Statistical Inference (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (a) The sum of all of the possible values is 1. Explanation: A probability mass function evaluated at a value corresponds to the probability that a random variable takes that value. |
|
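To make the pmf condition above concrete, here is a small sketch using a binomial pmf. The parameters `n = 10` and `p = 0.3` and the helper name `binomial_pmf` are arbitrary choices for illustration.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

n, p = 10, 0.3  # arbitrary illustrative parameters
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(probs)  # a valid pmf sums to 1 over all possible values
```

Each entry of `probs` is non-negative, and summing over every possible value of `k` gives 1 up to rounding.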
| 29. |
Which of the following components is involved in generalized linear models? (a) An exponential family model for the response (b) A systematic component via a linear predictor (c) A link function that connects the means of the response to the linear predictor (d) All of the mentioned. Topic: Binary and Count Outcomes (Statistical Inference and Regression Models) |
|
Answer» The correct option is (d) All of the mentioned |
|
| 30. |
Which of the following statements is incorrect with respect to outliers? (a) Outliers can have varying degrees of influence (b) Outliers can be the result of spurious or real processes (c) Outliers cannot conform to the regression relationship (d) None of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (c) Outliers cannot conform to the regression relationship |
|
| 31. |
Which of the following things can be accomplished with a linear model? (a) Flexibly fit complicated functions (b) Uncover complex multivariate relationships (c) Build accurate prediction models (d) All of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» The correct option is (d) All of the mentioned |
|
| 32. |
Point out the wrong statement. (a) Asymptotics are used for inference usually (b) Adding squared terms makes it continuously differentiable at the knot points (c) Adding squared terms makes it twice continuously differentiable at the knot points (d) None of the mentioned. Topic: Binary and Count Outcomes (Statistical Inference and Regression Models) |
|
Answer» Right answer is (c) Adding squared terms makes it twice continuously differentiable at the knot points |
|
| 33. |
Residual ______ plots investigate normality of the errors. (a) RR (b) PP (c) QQ (d) None of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» Right option is (c) QQ |
|
| 34. |
Which of the following is correct with respect to residuals? (a) Positive residuals are above the line, negative residuals are below (b) Positive residuals are below the line, negative residuals are above (c) Positive residuals and negative residuals are below the line (d) All of the mentioned. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (a) Positive residuals are above the line, negative residuals are below |
|
| 35. |
Which of the following is the oldest multiple testing correction? (a) Bonferroni correction (b) Bernoulli correction (c) Likelihood correction (d) All of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Correct answer is (a) Bonferroni correction. Explanation: Bonferroni correction is easy to calculate. |
|
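The explanation above notes that the Bonferroni correction is easy to calculate: with m tests at overall level alpha, each p-value is compared against alpha / m. A minimal sketch follows. The p-values are invented, and `bonferroni_reject` is a hypothetical helper name.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one reject/keep decision per test using the threshold alpha / m."""
    m = len(p_values)
    threshold = alpha / m
    return [p <= threshold for p in p_values]

p_values = [0.001, 0.012, 0.03, 0.2]  # invented p-values from four tests
decisions = bonferroni_reject(p_values)
```

With four tests the per-test threshold is 0.0125, so only the first two hypotheses are rejected here, even though 0.03 would pass an uncorrected 0.05 cutoff.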
| 36. |
Chebyshev’s inequality states that the probability of a “Six Sigma” event is less than ___________ (a) 10% (b) 20% (c) 30% (d) 3%. Topic: Probability and Statistics (Statistical Inference and Regression Models) |
|
Answer» Correct option is (d) 3%. Explanation: If a bell curve is assumed, the probability of a “six sigma” event is on the order of one ten millionth of a percent. |
|
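The 3% figure follows from Chebyshev's bound P(|X - mu| >= k*sigma) <= 1/k^2 with k = 6. The sketch below computes that bound and compares it with the two-sided tail of a standard normal, as the explanation does. The function names are illustrative choices.

```python
import math

def chebyshev_bound(k: float) -> float:
    """Upper bound on P(|X - mu| >= k * sigma) from Chebyshev's inequality."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k**2)

def normal_two_sided_tail(k: float) -> float:
    """P(|Z| >= k) for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))

six_sigma_bound = chebyshev_bound(6)         # 1/36, a little under 3%
six_sigma_normal = normal_two_sided_tail(6)  # far smaller when normality is assumed
```

Chebyshev's bound holds for any distribution with finite variance, which is why it is so much looser than the normal-theory value.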
| 37. |
Point out the wrong statement. (a) A percentile is simply a quantile expressed as a percent (b) There are two types of random variable (c) R cannot approximate quantiles for you for common distributions (d) None of the mentioned. Topic: Probability and Statistics (Statistical Inference and Regression Models) |
|
Answer» The correct option is (c) R cannot approximate quantiles for you for common distributions |
|
| 38. |
Multivariate regression estimates are exactly those having removed the linear relationship of the other variables from both the regressor and response. (a) True (b) False. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» Right choice is (a) True |
|
| 39. |
Point out the wrong statement. (a) The fraction of variance unexplained is an established concept in the context of linear regression (b) “Explained variance” is routinely used in principal component analysis (c) The general linear model extends simple linear regression (SLR) by adding terms linearly into the model (d) None of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» Right option is (d) None of the mentioned. Explanation: Linearity refers to a mathematical relationship or function that can be graphically represented as a straight line. |
|
| 40. |
Minimizing the likelihood is the same as maximizing -2 log likelihood. (a) True (b) False. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» Correct option is (a) True. Explanation: -2 log likelihood is a decreasing function of the likelihood, so maximizing the likelihood is the same as minimizing -2 log likelihood, and vice versa. |
|
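The equivalence in the question above comes from -2 log L being a strictly decreasing function of L, so the two criteria pick out the same parameter values in opposite directions. A small grid search on a Bernoulli likelihood illustrates this. The data (7 successes in 10 trials) and the grid are assumptions made for the example.

```python
import math

def likelihood(p: float, successes: int, n: int) -> float:
    """Bernoulli likelihood of success probability p given the observed counts."""
    return p**successes * (1.0 - p) ** (n - successes)

def neg2_loglik(p: float, successes: int, n: int) -> float:
    """-2 times the log likelihood."""
    return -2.0 * math.log(likelihood(p, successes, n))

successes, n = 7, 10                       # assumed data
grid = [i / 100 for i in range(1, 100)]    # candidate values 0.01, ..., 0.99

p_max_lik = max(grid, key=lambda p: likelihood(p, successes, n))
p_min_dev = min(grid, key=lambda p: neg2_loglik(p, successes, n))
p_min_lik = min(grid, key=lambda p: likelihood(p, successes, n))
p_max_dev = max(grid, key=lambda p: neg2_loglik(p, successes, n))
```

On this grid the maximizer of the likelihood matches the minimizer of -2 log likelihood (the sample proportion 0.7), and the minimizer of the likelihood matches the maximizer of -2 log likelihood.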
| 41. |
Point out the wrong statement. (a) Regression through the origin yields an equivalent slope if you center the data first (b) Normalizing variables results in the slope being the correlation (c) Least squares is not an estimation tool (d) None of the mentioned. Topic: Introduction to Regression Models (Statistical Inference and Regression Models) |
|
Answer» Right option is (c) Least squares is not an estimation tool. Explanation: Least squares is an estimation tool. |
|
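Statements (a) and (b) in the question above can be checked numerically. This sketch, on made-up data with hypothetical helper names, compares the usual least-squares slope with the through-origin slope on centered data, and the through-origin slope on normalized data with the sample correlation.

```python
import math

def center(values):
    """Subtract the sample mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def normalize(values):
    """Center, then scale by the sample standard deviation."""
    c = center(values)
    sd = math.sqrt(sum(v * v for v in c) / (len(values) - 1))
    return [v / sd for v in c]

def slope_through_origin(x, y):
    """Least-squares slope for the model y = slope * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def slope_with_intercept(x, y):
    """Least-squares slope for y = intercept + slope * x, from the normal equations."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

def correlation(x, y):
    """Sample Pearson correlation."""
    cx, cy = center(x), center(y)
    sxy = sum(a * b for a, b in zip(cx, cy))
    return sxy / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

x = [1.0, 2.0, 4.0, 5.0, 7.0]  # made-up data
y = [1.5, 3.0, 3.5, 6.0, 8.0]

slope_full = slope_with_intercept(x, y)
slope_centered = slope_through_origin(center(x), center(y))
slope_normalized = slope_through_origin(normalize(x), normalize(y))
r = correlation(x, y)
```

Both pairs agree up to rounding: centering reproduces the ordinary slope, and normalizing turns the slope into the correlation.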
| 42. |
The pooled estimator is a mixture of the group variances, placing greater weight on whichever has a larger sample size. (a) True (b) False. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Correct answer is (a) True. Explanation: If the sample sizes are the same, the pooled variance estimate is the average of the group variances. |
|
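The weighting described above can be sketched with the standard two-group pooled variance, which weights each sample variance by its degrees of freedom. The groups below are invented, and `pooled_variance` is a hypothetical helper name.

```python
import statistics

def pooled_variance(x, y):
    """Pooled variance: group variances weighted by degrees of freedom (n - 1)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    return ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)

a = [1.0, 2.0, 3.0, 4.0]                           # small group
b = [2.0, 4.0, 6.0, 8.0]                           # same size as a
c = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]   # larger group

var_a = statistics.variance(a)
var_b = statistics.variance(b)
var_c = statistics.variance(c)

pooled_equal = pooled_variance(a, b)    # equal sizes: plain average of var_a and var_b
pooled_unequal = pooled_variance(a, c)  # unequal sizes: pulled toward var_c
```

With equal group sizes the pooled value is the simple average of the two variances; with unequal sizes it sits closer to the variance of the larger group.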
| 43. |
Which of the following can be considered a random variable? (a) The outcome from the roll of a die (b) The outcome of the flip of a coin (c) The outcome of an exam (d) All of the mentioned. Topic: Introduction to Statistical Inference (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (d) All of the mentioned |
|
| 44. |
Which of the following shows residuals divided by their standard deviations? (a) rstudent (b) cooks.distance (c) rstandard (d) all of the mentioned. Topic: Residual Variation and Multivariate (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (c) rstandard |
|
| 45. |
Which of the following tools is used for estimating standard errors and the bias of estimators? (a) knitr (b) jackknife (c) ggplot2 (d) all of the mentioned. Topic: Statistical Inference Concepts (Statistical Inference and Regression Models) |
|
Answer» Correct option is (b) jackknife. Explanation: The jackknife involves resampling data. |
|
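The jackknife mentioned in the explanation above resamples by leaving out one observation at a time. The sketch below applies it to the sample mean, where the jackknife standard error is known to match s / sqrt(n) and the bias estimate is zero. The data are made up, and `jackknife` is a hypothetical helper rather than a library function.

```python
import math
import statistics

def jackknife(values, estimator):
    """Leave-one-out jackknife estimates of the bias and standard error of an estimator."""
    n = len(values)
    full = estimator(values)
    # Recompute the estimator with each observation removed in turn.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    bias = (n - 1) * (loo_mean - full)
    se = math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out))
    return bias, se

data = [3.1, 4.7, 5.0, 5.9, 6.4, 8.2, 9.3]  # made-up sample
bias, se = jackknife(data, statistics.mean)
classic_se = statistics.stdev(data) / math.sqrt(len(data))
```

The same helper can be passed another estimator, such as `statistics.median`, although the jackknife is less reliable for non-smooth statistics.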
| 46. |
Gosset’s distribution was invented by which of the following scientists? (a) William Gosset (b) William Gosling (c) Gosling Gosset (d) All of the mentioned. Topic: Likelihood (Statistical Inference and Regression Models) |
|
Answer» The correct answer is (a) William Gosset |
|
| 47. |
Bernoulli random variables take (only) the values 1 and 0. (a) True (b) False. Topic: Common Distributions (Statistical Inference and Regression Models) |
|
Answer» Right answer is (a) True |
|
| 48. |
Which of the following is incorrect with respect to the use of the Poisson distribution? (a) Modeling event/time data (b) Modeling bounded count data (c) Modeling contingency tables (d) All of the mentioned. Topic: Common Distributions (Statistical Inference and Regression Models) |
|
Answer» The correct choice is (b) Modeling bounded count data |
|
| 49. |
Which of the following inequalities is useful for interpreting variances? (a) Chebyshev (b) Stautaory (c) Testory (d) All of the mentioned. Topic: Probability and Statistics (Statistical Inference and Regression Models) |
|
Answer» Correct option is (a) Chebyshev. Explanation: Chebyshev’s inequality is also spelled as Tchebysheff’s inequality. |
|
| 50. |
Bayesian inference uses frequency interpretations of probabilities to control error rates. (a) True (b) False. Topic: Introduction to Statistical Inference (Statistical Inference and Regression Models) |
|
Answer» Correct choice is (b) False |
|