This section includes curated multiple-choice questions to sharpen your knowledge and support exam preparation.

1.

Which of the following is a goal of literate statistical programming?
(a) Combine explanatory text and data analysis code in a single document
(b) Ensure that data analysis documents are always exported in JPEG format
(c) Require that data analysis summaries always be written in R
(d) None of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The correct option is (a) Combine explanatory text and data analysis code in a single document

Explanation: Literate statistical programming is a methodology in which the explanatory text and the analysis code live in the same document.

2.

Point out the wrong statement.
(a) File devices are useful for creating plots that can be included in other documents or sent to other people
(b) Plots must be created on a graphics device
(c) For file devices, there are vector and bitmap formats
(d) None of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The right choice is (d) None of the mentioned

Explanation: All three statements are true. In particular, file devices come in both vector and bitmap formats.

3.

Which of the following is required to implement a literate programming system?
(a) A programming language like Perl
(b) A programming language like Java
(c) A programming language like R
(d) All of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The right choice is (c) A programming language like R

Explanation: A literate programming system pairs a documentation language with a programming language. In the statistical setting covered here, that language is R, a language and environment for statistical computing and graphics.

4.

Which of the following is the correct order of conversion?
(a) .md -> .Rmd -> .html
(b) .Rmd -> .md -> .html
(c) .Rmd -> .md -> .xml
(d) All of the mentioned
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The right option is (b) .Rmd -> .md -> .html

Explanation: knitr converts the R Markdown (.Rmd) document into a Markdown (.md) document, which is then converted into HTML by default.

5.

Which of the following is required for not echoing the code?
(a) echo=TRUE
(b) print=TRUE
(c) echo=FALSE
(d) All of the mentioned
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The correct answer is (c) echo=FALSE

Explanation: Setting echo=FALSE hides a chunk's source code in the output. It can be set on a single chunk or, with a line of code, as a global option for the whole document.

6.

Which of the following gives reviewers an important tool without dramatically increasing the burden?
(a) Quality research
(b) Replication research
(c) Reproducible research
(d) None of the mentioned
Topic: Introduction to Reproducible Research (Data Analysis and Research, Data Science)

Answer» The correct option is (c) Reproducible research

Explanation: Reproducible research is important, but it does not necessarily answer the critical question of whether a data analysis is trustworthy.

7.

Point out the correct statement.
(a) The choice of an appropriate metric will influence the shape of the clusters
(b) Hierarchical clustering is also called HCA
(c) In general, the merges and splits are determined in a greedy manner
(d) All of the mentioned
Topic: Clustering (Data Analysis and Research, Data Science)

Answer» The right answer is (d) All of the mentioned

Explanation: Some elements may be close to one another according to one distance and farther away according to another, so the choice of metric affects the clusters found.
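
The effect of the metric can be illustrated with a small sketch. The quiz itself targets R; Python is used here only as a neutral illustration, and the point names (`origin`, `a`, `b`) are hypothetical values chosen for this example.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(p, q))

# Hypothetical points chosen so that the two metrics disagree.
origin = (0.0, 0.0)
a = (3.0, 3.0)  # lies on the diagonal
b = (0.0, 5.0)  # lies on an axis

# Under Euclidean distance, a is nearer to the origin than b.
a_nearer_euclidean = euclidean(origin, a) < euclidean(origin, b)
# Under Manhattan distance, the ordering flips: b is nearer.
b_nearer_manhattan = manhattan(origin, b) < manhattan(origin, a)
```

Because nearest-neighbour relationships change with the metric, a clustering algorithm may group the same points differently.
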
8.

Which of the following is the odd one out in the figure below?
(a) Scatterplots
(b) 5 number summary
(c) 2D Graph
(d) None of the mentioned
Topic: Plotting Systems (Data Analysis and Research, Data Science)

Answer» The right answer is (b) 5 number summary

Explanation: The five-number summary is a one-dimensional summary of a single variable, whereas the other options are two-dimensional.
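
A minimal sketch of a five-number summary (minimum, lower quartile, median, upper quartile, maximum) follows. It uses Python's standard library rather than R, the function name `five_number_summary` is hypothetical, and the "inclusive" quartile convention is an assumption; other conventions can give slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one variable."""
    # quantiles() sorts the data itself; n=4 yields the three quartiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Hypothetical sample of a single variable.
sample = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(sample)
```

The input is a single list of values, which is what makes the summary one-dimensional.
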
9.

Some chunks have to be re-computed every time you re-knit the file.
(a) True
(b) False
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The correct choice is (b) False

Explanation: By default, all chunks (not just some) are re-computed every time you re-knit the file, unless caching is enabled for a chunk with cache=TRUE.

10.

Which of the following graphs has the properties shown in the figure below?
(a) Exploratory
(b) Inferential
(c) Causal
(d) None of the mentioned
Topic: Exploratory Graphs (Data Analysis and Research, Data Science)

Answer» The correct answer is (a) Exploratory

Explanation: Making plots of the data reveals various interesting features.

11.

Which of the following functions is used for k-means clustering?
(a) k-means
(b) k-mean
(c) heatmap
(d) None of the mentioned
Topic: Clustering (Data Analysis and Research, Data Science)

Answer» The right choice is (a) k-means

Explanation: In R the function is spelled kmeans(). K-means requires the number of clusters to be specified.

12.

Which of the following functions displays the currently active graphics device?
(a) dev.present
(b) dev.cur
(c) pre.cur
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct option is (b) dev.cur

Explanation: dev.cur() reports the currently active graphics device. You can change the active graphics device with dev.set().

13.

Which of the following is required to make work reproducible?
(a) Keep track of things
(b) Save output
(c) Save data in proprietary formats
(d) None of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The correct answer is (a) Keep track of things

Explanation: Keep track of what you do, and save data in non-proprietary formats, to make work reproducible.

14.

Which of the following tools can be used for integrating text and code in one document?
(a) knitr
(b) ggplot2
(c) NumPy
(d) None of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The right option is (a) knitr

Explanation: knitr is a way to write LaTeX, HTML, and Markdown with R code interlaced.

15.

Which of the following is also referred to as an overlayed 1D plot?
(a) lattice
(b) barplot
(c) gplot
(d) All of the mentioned
Topic: Exploratory Graphs (Data Analysis and Research, Data Science)

Answer» The correct answer is (a) lattice

Explanation: lattice is an add-on package that implements Trellis graphics.

16.

Which of the following gave rise to the need for graphs in data analysis?
(a) Data visualization
(b) Communicating results
(c) Decision making
(d) All of the mentioned
Topic: Exploratory Graphs (Data Analysis and Research, Data Science)

Answer» The correct answer is (d) All of the mentioned

Explanation: A picture can tell a better story than raw data.

17.

Point out the wrong statement.
(a) k-means clustering is a method of vector quantization
(b) k-means clustering aims to partition n observations into k clusters
(c) k-nearest neighbor is the same as k-means
(d) None of the mentioned
Topic: Clustering (Data Analysis and Research, Data Science)

Answer» The correct option is (c) k-nearest neighbor is the same as k-means

Explanation: k-nearest neighbor is a different method and has nothing to do with k-means.

18.

Which of the following arguments to the par function specifies the margin size?
(a) las
(b) bg
(c) mar
(d) All of the mentioned
Topic: Plotting Systems (Data Analysis and Research, Data Science)

Answer» The correct choice is (c) mar

Explanation: The par function is used to specify global graphics parameters, and mar is the one that sets the margin size.

19.

Which of the following disadvantages does literate programming have?
(a) Slow processing of documents
(b) Code is not automatic
(c) No logical order
(d) All of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The right option is (a) Slow processing of documents

Explanation: Code and text are in one place, so processing the document means running the code, which can be slow.

20.

Which of the following is an effective way of checking the validity of a data analysis?
(a) Re-run the analysis
(b) Review the code
(c) Check the sensitivity
(d) All of the mentioned
Topic: Introduction to Reproducible Research (Data Analysis and Research, Data Science)

Answer» The right answer is (d) All of the mentioned

Explanation: Reproducibility addresses the most "downstream" aspect of the research process.

21.

Which type of graph, by dimension, is shown in the figure below?
(a) One-dimensional
(b) Two-dimensional
(c) Three-dimensional
(d) None of the mentioned
Topic: Exploratory Graphs (Data Analysis and Research, Data Science)

Answer» The correct option is (b) Two-dimensional

Explanation: A two-dimensional graph is a set of points in two-dimensional space.

22.

Which of the following combinations is incorrect?
(a) Continuous – euclidean distance
(b) Continuous – correlation similarity
(c) Binary – manhattan distance
(d) None of the mentioned
Topic: Clustering (Data Analysis and Research, Data Science)

Answer» The correct answer is (d) None of the mentioned

Explanation: All three pairings are valid. You should choose a distance or similarity that makes sense for your problem.
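
One way to see why Manhattan distance suits binary data: on 0/1 vectors it simply counts the positions where the two vectors differ. The sketch below is in Python rather than R, and the vectors `x` and `y` are hypothetical.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def mismatches(p, q):
    """Number of positions at which the two vectors differ."""
    return sum(1 for a, b in zip(p, q) if a != b)

# Hypothetical binary feature vectors.
x = [1, 0, 1, 1, 0]
y = [1, 1, 0, 1, 0]

# For binary vectors each differing position contributes exactly 1,
# so the Manhattan distance equals the mismatch count.
same_value = manhattan(x, y) == mismatches(x, y)
```
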
23.

Which of the following documentation languages is supported by knitr?
(a) RMarkdown
(b) LaTeX
(c) HTML
(d) None of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The right choice is (a) RMarkdown

Explanation: knitr, which is available on CRAN, works with R Markdown. It also supports LaTeX and HTML as documentation languages, so options (b) and (c) are defensible as well.

24.

Point out the wrong statement with respect to reproducibility.
(a) Focuses on the validity of the data analysis
(b) The ultimate standard for strengthening scientific evidence
(c) Important when replication is impossible
(d) None of the mentioned
Topic: Introduction to Reproducible Research (Data Analysis and Research, Data Science)

Answer» The right option is (b) The ultimate standard for strengthening scientific evidence

Explanation: That description applies to replication, not reproducibility. Replication is particularly important in studies that can impact broad policy or regulatory decisions.

25.

Point out the correct statement.
(a) Coplots are one-dimensional data graphs
(b) Exploratory graphs are made quickly
(c) Exploratory graphs are made in relatively small numbers
(d) All of the mentioned
Topic: Exploratory Graphs (Data Analysis and Research, Data Science)

Answer» The right answer is (b) Exploratory graphs are made quickly

Explanation: Exploratory graphs are made quickly and in large numbers. A coplot shows the relationship between two variables conditioned on others, so it is not a one-dimensional graph.

26.

Hierarchical clustering should be primarily used for exploration.
(a) True
(b) False
Topic: Clustering (Data Analysis and Research, Data Science)

Answer» The correct choice is (a) True

Explanation: Hierarchical clustering is deterministic.

27.

Which of the following is required by K-means clustering?
(a) Defined distance metric
(b) Number of clusters
(c) Initial guess as to cluster centroids
(d) All of the mentioned
Topic: Clustering (Data Analysis and Research, Data Science)

Answer» The correct option is (d) All of the mentioned

Explanation: K-means clustering follows a partitioning approach, which needs a distance metric, the number of clusters, and an initial guess for the centroids.
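
The three requirements show up directly as inputs in a minimal sketch of the standard assign-and-update iteration (Lloyd's algorithm). This is written in plain Python rather than R's kmeans(); the function signature, the sample points, and the initial centroids are all hypothetical choices for illustration.

```python
import math

def kmeans(points, initial_centroids, distance=math.dist, max_iter=100):
    """Assign-and-update iteration; the number of clusters is len(initial_centroids)."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
initial = [(0.0, 0.0), (10.0, 10.0)]  # initial guess, so k = 2
centroids, clusters = kmeans(points, initial)
```

The mean-based update step is tied to Euclidean distance; passing a different `distance` function here changes only the assignment step.
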

28.

Which of the following is required to implement a literate programming system?
(a) A programming language like Perl
(b) A programming language like Java
(c) A programming language like R
(d) All of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The right choice is (c) A programming language like R

Explanation: A literate programming system pairs a documentation language with a programming language. In the statistical setting covered here, that language is R, a language and environment for statistical computing and graphics.

29.

What does it mean to weave a literate statistical program?
(a) Convert a program from S to Python
(b) Convert the program into a human-readable document
(c) Convert a program to decompress it
(d) All of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The correct option is (b) Convert the program into a human-readable document

Explanation: Weaving produces the human-readable document from the literate program. Literate statistical programming can be done with knitr.

30.

knitr is good for complex time-consuming computations.
(a) True
(b) False
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The correct choice is (b) False

Explanation: knitr is a poor fit for complex, time-consuming computations.

31.

The core plotting engine is encapsulated in the graphics package.
(a) True
(b) False
Topic: Plotting Systems (Data Analysis and Research, Data Science)

Answer» The correct choice is (a) True

Explanation: The graphics package contains the plotting functions for the base graphics system.

32.

Which of the following parameters defines the line type, such as dashed or dotted?
(a) lty
(b) pch
(c) lwd
(d) All of the mentioned
Topic: Plotting Systems (Data Analysis and Research, Data Science)

Answer» The correct option is (a) lty

Explanation: lty sets the line type, whereas lwd is used for line width.

33.

Which of the following is a vector file device?
(a) png
(b) svg
(c) bmp
(d) None of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct option is (b) svg

Explanation: svg stands for Scalable Vector Graphics.

34.

Which of the following is used to change the active graphics device?
(a) dev.set
(b) dev.int
(c) dev.win
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct option is (a) dev.set

Explanation: You can change the active graphics device with dev.set(n), where n is the integer number associated with the graphics device you want to switch to.

35.

Which of the following will copy the plot from one device to another?
(a) dev.copy
(b) dev.copypdf
(c) dev.device
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The right answer is (a) dev.copy

Explanation: Copying a plot to another device can be useful because some plots require a lot of code, and it can be a pain to type all of it in again for a different device.

36.

Which of the following graphics devices is available only on Windows?
(a) pdf
(b) svg
(c) win.metafile
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct choice is (c) win.metafile

Explanation: Exporting graphics to a Windows Metafile can be achieved via the win.metafile device.

37.

The document produced by knitr has which of the following extensions?
(a) .md
(b) .rmd
(c) .html
(d) None of the mentioned
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The correct answer is (a) .md

Explanation: knitr takes an R Markdown (.Rmd) document as input and produces a Markdown (.md) document.

38.

Which of the following is similar to a pre-specified clinical trial protocol?
(a) Caching-based Data Analysis
(b) Evidence-based Data Analysis
(c) Markdown-based Data Analysis
(d) All of the mentioned
Topic: Introduction to Reproducible Research (Data Analysis and Research, Data Science)

Answer» The correct answer is (b) Evidence-based Data Analysis

Explanation: Evidence-based data analysis uses a deterministic statistical machine, much as a clinical trial follows a protocol fixed in advance.

39.

Color and shape are used to add dimensions to graph data.
(a) True
(b) False
Topic: Exploratory Graphs (Data Analysis and Research, Data Science)

Answer» The correct choice is (a) True

Explanation: Color and shape let a two-dimensional plot encode additional variables. Graphs are commonly used by print and electronic media.

40.

Which of the following is the second goal of PCA?
(a) Data compression
(b) Statistical analysis
(c) Data dredging
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct option is (a) Data compression

Explanation: The principal components are equal to the right singular vectors if you first scale the variables.

41.

Point out the correct statement.
(a) Vector formats are good for line drawings and plots with solid colors using a modest number of points
(b) Vector formats are good for plots with a large number of points, natural scenes or web-based plots
(c) The default graphics device is always the screen device
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct answer is (a) Vector formats are good for line drawings and plots with solid colors using a modest number of points

Explanation: It is bitmap formats, not vector formats, that are good for plots with a large number of points, natural scenes or web-based plots.

42.

Which of the following is a bitmap file type?
(a) tiff
(b) svg
(c) pdf
(d) None of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct choice is (a) tiff

Explanation: TIFF is a computer file format for storing raster graphics images, whereas svg and pdf are vector formats.

43.

What is the role of processing code in the research pipeline?
(a) Transforms the analytical results into figures and tables
(b) Transforms the analytic data into measured data
(c) Transforms the measured data into analytic data
(d) All of the mentioned
Topic: Literate Statistical Programming (Data Analysis and Research, Data Science)

Answer» The correct option is (c) Transforms the measured data into analytic data

Explanation: Processing code turns measured data into analytic data, which analytic code then turns into computational results. The data science workflow as a whole is a non-linear, iterative process.

44.

Which of the following global options can take the value "hide"?
(a) results
(b) fig.width
(c) echo
(d) None of the mentioned
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The correct choice is (a) results

Explanation: Setting results to "hide" suppresses a chunk's printed output. R Markdown is a format for writing reproducible, dynamic reports with R.

45.

Point out the correct combination related to output statements.
(a) results: "asis"
(b) echo: true
(c) echo=false
(d) None of the mentioned
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The correct choice is (a) results: "asis"

Explanation: The results option accepts values such as "asis" and "hide". The echo option takes the values TRUE and FALSE, which are uppercase in R.

46.

Which of the following annotation functions is used to add or modify text?
(a) word
(b) graph
(c) lines
(d) All of the mentioned
Topic: Plotting Systems (Data Analysis and Research, Data Science)

Answer» The listed answer is (d) All of the mentioned

Explanation: In base R graphics, text() adds text labels and lines() adds lines to a plot; word and graph are not base graphics annotation functions. points() and axis() are other well-known annotation functions.

47.

Which of the following functions has the parameters shown in the figure below?
(a) par
(b) bar
(c) base
(d) All of the mentioned
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The right option is (a) par

Explanation: R makes it easy to combine multiple plots into one overall graph, using either the par() or layout() function.

48.

Which of the following graphics devices is the odd one out in the figure below?
(a) quartz
(b) window
(c) unix
(d) x11
Topic: Graphics Devices (Data Analysis and Research, Data Science)

Answer» The correct answer is (c) unix

Explanation: quartz, windows and x11 are the screen devices on macOS, Windows and Unix/Linux respectively. There is no graphics device named unix.

49.

Which of the following is suitable for knitr?
(a) Reports
(b) Data preprocessing documents
(c) Technical manuals
(d) All of the mentioned
Topic: knitr (Data Analysis and Research, Data Science)

Answer» The right option is (d) All of the mentioned

Explanation: knitr works well for manuals, short to medium-length technical documents, reports, and data preprocessing documents.

50.

Reproducibility determines the correctness of a data analysis.
(a) True
(b) False
Topic: Introduction to Reproducible Research (Data Analysis and Research, Data Science)

Answer» The correct option is (b) False

Explanation: An analysis can be fully reproducible and still be wrong, so reproducibility does not guarantee that a data analysis is correct.