This section offers curated multiple-choice questions to sharpen your knowledge and support exam preparation.

1.

The ________ project builds on top of pandas and matplotlib to provide easy plotting of data.
(a) yhat
(b) Seaborn
(c) Vincent
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (b) Seaborn

Explanation: Seaborn has great support for pandas data objects.

2.

Spyder can introspect and display pandas DataFrames.
(a) True
(b) False
Topic: Pandas, Data Analysis with Python

Answer» (a) True

Explanation: Spyder can display DataFrames and shows both column-wise min/max and global min/max coloring.

3.

Which of the following makes use of pandas and returns data in a Series or DataFrame?
(a) pandaSDMX
(b) fredapi
(c) OutPy
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (b) fredapi

Explanation: The fredapi module requires a FRED API key, which you can obtain for free on the FRED website.

4.

Which of the following brings a foundational exploratory visualization package for the R language to the pandas ecosystem?
(a) yhat
(b) Seaborn
(c) Vincent
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) yhat

Explanation: ggplot2 is a foundational exploratory visualization package for R. The yhat ggplot project is a Python port of it and has great support for pandas data objects.

5.

Which of the following is implemented on DataFrame to compute the correlation between like-labeled Series contained in different DataFrame objects?
(a) corrwith
(b) corwith
(c) corwit
(d) None of the mentioned
Topic: Computational tools, Data Analysis with Python

Answer» (a) corrwith

Explanation: DataFrame.corrwith matches rows or columns by label and computes the correlation of each matched pair. A coefficient close to 1 means the two Series move together very closely.
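As a minimal sketch of the behaviour described above (the two DataFrames and their values are made up for illustration), corrwith can be tried like this:

```python
import pandas as pd

# Hypothetical data: two DataFrames sharing the column labels "a" and "b".
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [4.0, 3.0, 2.0, 1.0]})
df2 = pd.DataFrame({"a": [2.0, 4.0, 6.0, 8.0], "b": [1.0, 2.0, 3.0, 4.0]})

# Correlate each column of df1 with the same-named column of df2.
result = df1.corrwith(df2)
print(result)
```

Column "a" rises in both frames while column "b" moves in opposite directions, so the two coefficients have opposite signs.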

6.

You can create a scatter plot matrix using the __________ method in pandas.tools.plotting.
(a) sca_matrix
(b) scatter_matrix
(c) DataFrame.plot
(d) All of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (b) scatter_matrix

Explanation: scatter_matrix draws a matrix of pairwise scatter plots. In current pandas versions it lives in pandas.plotting rather than pandas.tools.plotting. Density plots, by contrast, are created with Series/DataFrame.plot.

7.

Point out the correct statement.
(a) All of the standard pandas data structures have a to_sparse method
(b) Any sparse object can be converted back to the standard dense form by calling to_dense
(c) The sparse objects exist for memory efficiency reasons
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: The to_sparse method takes a kind argument and a fill_value. Note that this describes older pandas versions; to_sparse was removed in pandas 1.0 in favor of sparse dtypes.

8.

Which of the following plots is used to check whether a data set or time series is random?
(a) Lag
(b) Random
(c) Lead
(d) None of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (a) Lag

Explanation: Random data should not exhibit any structure in the lag plot.

9.

Which of the following statements will import pandas?
(a) import pandas as pd
(b) import panda as py
(c) import pandaspy as pd
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) import pandas as pd

Explanation: The package name is pandas, and pd is its conventional alias. Once it is imported, you can, for example, read data from a CSV file using the read_csv function.

10.

Which of the following works analogously to the form of the dict constructor that takes a sequence of (key, value) pairs?
(a) DataFrame.from_items
(b) DataFrame.from_records
(c) DataFrame.from_dict
(d) All of the mentioned
Topic: Pandas Data Structure, Data Analysis with Python

Answer» (a) DataFrame.from_items

Explanation: DataFrame.from_items takes a sequence of (key, value) pairs, whereas DataFrame.from_records takes a list of tuples or an ndarray with structured dtype. Note that from_items belongs to older pandas versions; it was deprecated and then removed in pandas 1.0, where DataFrame.from_dict covers the same use.
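Since from_items is no longer available in recent pandas, the following sketch uses the two alternative constructors that still exist, from_dict and from_records. The labels and values are made up for illustration.

```python
import pandas as pd

# from_dict with orient="index" treats the dict keys as row labels.
by_row = pd.DataFrame.from_dict(
    {"r1": [1, 2], "r2": [3, 4]}, orient="index", columns=["x", "y"]
)

# from_records accepts a list of tuples (one tuple per row).
by_record = pd.DataFrame.from_records(
    [(1, "a"), (2, "b")], columns=["num", "letter"]
)

print(by_row)
print(by_record)
```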

11.

Which of the following is used to compute the percent change over a given number of periods?
(a) pct_change
(b) percent_change
(c) per_change
(d) None of the mentioned
Topic: Computational tools, Data Analysis with Python

Answer» (a) pct_change

Explanation: Series and DataFrame both have a pct_change method (as did Panel in older pandas versions).

12.

Which of the following values is passed to the kind keyword to draw a bar plot?
(a) bar
(b) kde
(c) hexbin
(d) None of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (a) bar

Explanation: Passing kind='bar' to plot() draws a bar plot; kde draws a density plot and hexbin a hexagonal bin plot.

13.

The plot method on Series and DataFrame is just a simple wrapper around ____________
(a) gplt.plot()
(b) plt.plot()
(c) plt.plotgraph()
(d) None of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (b) plt.plot()

Explanation: If the index consists of dates, it calls gcf().autofmt_xdate() to try to format the x-axis nicely.

14.

Which of the following can be converted to other 'frequencies' by astyping to a specific timedelta type?
(a) Timedelta Series
(b) TimedeltaIndex
(c) Timedelta
(d) All of the mentioned
Topic: Time Deltas, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: The same conversion can be done by dividing by another timedelta. On a Series, these operations yield a Series and propagate NaT as NaN.
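The sketch below illustrates the frequency conversion by division rather than by astype, because the units accepted by astype differ between pandas versions. The sample values are made up.

```python
import pandas as pd

# A Timedelta scalar divided by a one-hour Timedelta gives hours as a float.
td = pd.Timedelta("1 days 12 hours")
hours = td / pd.Timedelta("1 hour")

# The same works on a Series; the missing entry (NaT) becomes NaN.
s = pd.Series(pd.to_timedelta(["1 days", "2 days", None]))
days = s / pd.Timedelta("1 day")

print(hours)
print(days)
```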

15.

Which of the following operations is supported on timedelta frames?
(a) idxmax
(b) ixmax
(c) ixmin
(d) None of the mentioned
Topic: Time Deltas, Data Analysis with Python

Answer» (a) idxmax

Explanation: min, max, and the corresponding idxmin and idxmax operations are supported on frames; ixmax and ixmin do not exist.

16.

Which of the following operations works with the same syntax as the analogous dict operation?
(a) Getting columns
(b) Setting columns
(c) Deleting columns
(d) All of the mentioned
Topic: Pandas Data Structure, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: You can treat a DataFrame semantically like a dict of like-indexed Series objects.
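A short sketch of the three dict-style operations on a DataFrame, using made-up column names:

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3], "two": [4, 5, 6]})

col = df["one"]                      # getting a column, like d[key]
df["three"] = df["one"] * df["two"]  # setting a column, like d[key] = value
del df["two"]                        # deleting a column, like del d[key]

print(df)
```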

17.

Which of the following specifies the required minimum number of observations for each column pair in order to have a valid result?
(a) min_periods
(b) max_periods
(c) minimum_periods
(d) All of the mentioned
Topic: Computational tools, Data Analysis with Python

Answer» (a) min_periods

Explanation: DataFrame.corr accepts an optional min_periods keyword, and DataFrame.cov also supports it.
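To make the effect of min_periods concrete, here is a small sketch with made-up data in which columns "a" and "b" share only two complete observations:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1.0, 2.0, np.nan, 4.0], "b": [2.0, np.nan, 6.0, 8.0]}
)

# Rows 0 and 3 are the only rows where both columns are present.
loose = df.cov(min_periods=2)   # the (a, b) pair has enough observations
strict = df.cov(min_periods=3)  # the (a, b) pair is reported as NaN

print(loose)
print(strict)
```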

18.

Combining a TimedeltaIndex with a DatetimeIndex allows certain operations that are NaT-preserving.
(a) True
(b) False
Topic: Time Deltas, Data Analysis with Python

Answer» (a) True

Explanation: You can also convert indices to yield another Index.

19.

Point out the wrong statement.
(a) min, max, idxmin, idxmax operations are supported on Series
(b) You cannot pass a timedelta to get a particular value
(c) Division by the numpy scalar is true division
(d) None of the mentioned
Topic: Time Deltas, Data Analysis with Python

Answer» (b) You cannot pass a timedelta to get a particular value

Explanation: You can pass a timedelta to select a particular value from a timedelta-indexed object. Separately, dividing or multiplying a timedelta64[ns] Series by an integer or integer Series yields another timedelta64[ns] dtype Series.

20.

Which of the following methods is used for transforming a SparseSeries indexed by a MultiIndex to a scipy.sparse.coo_matrix?
(a) SparseSeries.to_coo()
(b) Series.to_coo()
(c) SparseSeries.to_cooser()
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) SparseSeries.to_coo()

Explanation: This is an experimental API to transform between sparse pandas and scipy.sparse structures. SparseSeries was removed in pandas 1.0; current versions expose the conversion through the Series.sparse accessor.

21.

Point out the correct statement.
(a) Pandas consists of a set of labeled array data structures
(b) Pandas consists of an integrated group by engine for aggregating and transforming data sets
(c) Pandas consists of moving window statistics
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: Labeled array data structures, an integrated group by engine, and moving window statistics are all part of pandas.

22.

Which of the following methods produces a data ranking with ties being assigned the mean of the ranks for the group?
(a) rank
(b) dense_rank
(c) partition_rank
(d) None of the mentioned
Topic: Computational tools, Data Analysis with Python

Answer» (a) rank

Explanation: Series.rank assigns tied values the average of their ranks by default. rank is also a DataFrame method.
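A minimal sketch of tie handling in rank, with made-up values. The method="dense" call is included only for contrast.

```python
import pandas as pd

s = pd.Series([10, 20, 20, 30])

# Default method="average": the two tied values share ranks 2 and 3.
ranks = s.rank()

# For contrast, method="dense" gives ties the same rank without gaps.
dense = s.rank(method="dense")

print(ranks.tolist())
print(dense.tolist())
```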

23.

Which of the following plots is often used for checking randomness in time series?
(a) Autocausation
(b) Autorank
(c) Autocorrelation
(d) None of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (c) Autocorrelation

Explanation: If the time series is random, such autocorrelations should be near zero for any and all time-lag separations.
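The idea behind the plot can be checked numerically. This sketch uses Series.autocorr instead of drawing anything, and the two series (seeded random noise and a straight-line trend) are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Random noise: lag-1 autocorrelation should be close to zero.
noise = pd.Series(rng.normal(size=2000))

# A straight-line trend: strongly autocorrelated, so clearly not random.
trend = pd.Series(np.arange(2000, dtype=float))

noise_ac = noise.autocorr(lag=1)
trend_ac = trend.autocorr(lag=1)

print(noise_ac, trend_ac)
```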

24.

Which of the following is the base layer for all of the sparse indexed data structures?
(a) SArray
(b) SparseArray
(c) PyArray
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (b) SparseArray

Explanation: SparseArray is a 1-dimensional ndarray-like object storing only values distinct from the fill_value.

25.

Point out the wrong statement.
(a) qgrid is an interactive grid for sorting and filtering DataFrames
(b) Pandas DataFrames implement _repr_html_ methods which are utilized by IPython Notebook
(c) Spyder is a cross-platform Qt-based open-source R IDE
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (c) Spyder is a cross-platform Qt-based open-source R IDE

Explanation: Spyder is a cross-platform Qt-based open-source Python IDE.

26.

Point out the wrong statement.
(a) Series is a 1D labeled homogeneously-typed array
(b) DataFrame is a general 2D labeled, size-mutable tabular structure with potentially heterogeneously-typed columns
(c) Panel is a general 2D labeled, also size-mutable array
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (c) Panel is a general 2D labeled, also size-mutable array

Explanation: Panel is a general 3D labeled array.

27.

All pandas data structures are ___-mutable but not always _______-mutable.
(a) size, value
(b) semantic, size
(c) value, size
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (c) value, size

Explanation: The values a structure contains can be altered, but its size cannot always be. For example, the length of a Series cannot be changed.

28.

Point out the wrong statement.
(a) lxml is very fast
(b) lxml requires Cython to install correctly
(c) lxml does not make any guarantees about the results of its parse
(d) None of the mentioned
Topic: Computational tools, Data Analysis with Python

Answer» (c) lxml does not make any guarantees about the results of its parse

Explanation: The statement is too broad as written: lxml makes no guarantees about the results of its parse only when it is not given strictly valid markup. More generally, there are some versioning issues surrounding the libraries that are used to parse HTML tables in the top-level pandas io function read_html.

29.

Point out the correct combination with regard to the kind keyword for graph plotting.
(a) 'hist' for histogram
(b) 'box' for boxplot
(c) 'area' for area plots
(d) All of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: The kind keyword argument of plot() accepts a handful of values for plots other than the default line plot.

30.

Which of the following is used for testing for membership in the list of column names?
(a) in
(b) out
(c) elseif
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) in

Explanation: For a Series, in tests membership in the index. For DataFrames, likewise, in applies to the column axis.
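A brief sketch of the membership test, with made-up labels. Note that in checks labels, not values.

```python
import pandas as pd

s = pd.Series([1, 2], index=["a", "b"])
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

label_in_series = "a" in s   # checks the Series index
value_in_series = 1 in s     # 1 is a value, not an index label
col_in_frame = "x" in df     # checks the DataFrame column names
row_in_frame = 0 in df       # 0 is a row label, not a column name

print(label_in_series, value_in_series, col_in_frame, row_in_frame)
```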

31.

Which of the following is a prominent Python "statistics and econometrics library"?
(a) Bokeh
(b) Seaborn
(c) Statsmodels
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (c) Statsmodels

Explanation: Statsmodels is the statistics and econometrics library. Bokeh, by contrast, is a Python interactive visualization library for large datasets that natively uses the latest web technologies.

32.

The result of an operation between unaligned Series will have the ________ of the indexes involved.
(a) intersection
(b) union
(c) total
(d) All of the mentioned
Topic: Pandas Data Structure, Data Analysis with Python

Answer» (b) union

Explanation: If a label is not found in one Series or the other, the result for that label will be marked as missing (NaN).
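The alignment rule can be seen in a small sketch with two made-up Series whose indexes overlap only partly:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Labels "a" and "d" each appear in only one operand, so they become NaN.
total = s1 + s2

print(total)
```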

33.

Point out the wrong combination with regard to the kind keyword for graph plotting.
(a) 'scatter' for scatter plots
(b) 'kde' for hexagonal bin plots
(c) 'pie' for pie plots
(d) None of the mentioned
Topic: Plotting in Python, Data Analysis with Python

Answer» (b) 'kde' for hexagonal bin plots

Explanation: kde is used for density plots; hexagonal bin plots use 'hexbin'.

34.

Which of the following methods can be used to rename categorical data?
(a) Categorical.rename_categories()
(b) Categorical.rename()
(c) Categorical.mv_categories()
(d) None of the mentioned
Topic: Time Deltas, Data Analysis with Python

Answer» (a) Categorical.rename_categories()

Explanation: Renaming categories is done with rename_categories() or, in older pandas versions, by assigning new values to the Series.cat.categories property.
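A minimal sketch of renaming categories through the Series.cat accessor. The category names are made up.

```python
import pandas as pd

s = pd.Series(["a", "b", "a"], dtype="category")

# Map old category names to new ones; the result is a new Series.
renamed = s.cat.rename_categories({"a": "alpha", "b": "beta"})

print(renamed)
```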

35.

Pandas consists of static and moving window linear and panel regression.
(a) True
(b) False
Topic: Pandas, Data Analysis with Python

Answer» (a) True

Explanation: This was true of early pandas releases; the regression tools have since been removed, and statsmodels is the usual replacement. Time series and cross-sectional data are special cases of panel data.

36.

The rolling_count function gives the number of non-null observations.
(a) True
(b) False
Topic: Computational tools, Data Analysis with Python

Answer» (a) True

Explanation: In the older pandas API, rolling_count returned the number of non-null observations in each window. Current versions provide the same computation as .rolling(...).count().

37.

Which of the following is used for machine learning in Python?
(a) scikit-learn
(b) seaborn-learn
(c) stats-learn
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) scikit-learn

Explanation: scikit-learn is built on NumPy, SciPy, and matplotlib.

38.

Which of the following libraries is used to retrieve and acquire statistical data and metadata disseminated in SDMX 2.1?
(a) pandaSDMX
(b) fredapi
(c) geopandas
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) pandaSDMX

Explanation: pandaSDMX is the SDMX client. Geopandas, by contrast, extends pandas data objects to include geographic information that supports geometric operations.

39.

Which of the following objects do you get after reading a CSV file?
(a) DataFrame
(b) Character Vector
(c) Panel
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) DataFrame

Explanation: read_csv returns a DataFrame. You get columns out of a DataFrame the same way you get elements out of a dictionary.

40.

Which of the following inputs can be accepted by DataFrame?
(a) Structured ndarray
(b) Series
(c) DataFrame
(d) All of the mentioned
Topic: Pandas Data Structure, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

41.

Which of the following can be data in pandas?
(a) a Python dict
(b) an ndarray
(c) a scalar value
(d) All of the mentioned
Topic: Pandas Data Structure, Data Analysis with Python

Answer» (d) All of the mentioned

Explanation: A Series can be built from any of these. The passed index is a list of axis labels.

42.

Point out the correct statement.
(a) Pandas represents timestamps in microsecond resolution
(b) Pandas is 100% thread safe
(c) For Series and DataFrame objects, var normalizes by N-1 to produce unbiased estimates
(d) All of the mentioned
Topic: Computational tools, Data Analysis with Python

Answer» (c) For Series and DataFrame objects, var normalizes by N-1 to produce unbiased estimates

Explanation: Pandas represents timestamps in nanosecond resolution.

43.

Which of the following is used to generate an index with time deltas?
(a) TimeIndex
(b) TimedeltaIndex
(c) LeadIndex
(d) None of the mentioned
Topic: Time Deltas, Data Analysis with Python

Answer» (b) TimedeltaIndex

Explanation: When constructing a TimedeltaIndex you can pass string-like, Timedelta, timedelta, or np.timedelta64 objects.
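As a sketch of the input types listed above, the following builds one TimedeltaIndex from a mixed list. The durations are made up.

```python
import datetime

import numpy as np
import pandas as pd

# One entry of each accepted kind: string, Timedelta, timedelta, timedelta64.
tdi = pd.TimedeltaIndex(
    [
        "1 days",
        pd.Timedelta("2 days"),
        datetime.timedelta(days=3),
        np.timedelta64(4, "D"),
    ]
)

print(tdi)
```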

44.

Which of the following list-like data structures is used for managing a dynamic collection of SparseArrays?
(a) SparseList
(b) GeoList
(c) SparseSeries
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) SparseList

Explanation: To create one, simply call the SparseList constructor with a fill_value. SparseList belongs to older pandas versions and is not available in current releases.

45.

Point out the correct statement.
(a) Statsmodels provides powerful statistics, econometrics, analysis and modeling functionality that is out of pandas' scope
(b) Vintage leverages pandas objects as the underlying data container for computation
(c) Bokeh is a Python interactive visualization library for small datasets
(d) All of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) Statsmodels provides powerful statistics, econometrics, analysis and modeling functionality that is out of pandas' scope

Explanation: Bokeh targets large datasets, not small ones. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.

46.

Panel is a container for Series, and DataFrame is a container for DataFrame objects.
(a) True
(b) False
Topic: Pandas, Data Analysis with Python

Answer» (b) False

Explanation: DataFrame is a container for Series, and Panel is a container for DataFrame objects.

47.

Point out the correct statement.
(a) If data is a list and an index is passed, the values in data corresponding to the labels in the index will be pulled out
(b) NaN is the standard missing data marker used in pandas
(c) Series acts very similarly to an array
(d) None of the mentioned
Topic: Pandas Data Structure, Data Analysis with Python

Answer» (b) NaN is the standard missing data marker used in pandas

Explanation: Statement (a) holds for a dict, not a list: if data is a dict and an index is passed, the values in data corresponding to the labels in the index will be pulled out.
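A minimal sketch of both points, using a made-up dict: values are pulled out by index label, and a label missing from the dict is marked with NaN.

```python
import pandas as pd

data = {"a": 0.0, "b": 1.0, "c": 2.0}

# "b" and "c" are pulled from the dict; "d" is absent and becomes NaN.
s = pd.Series(data, index=["b", "c", "d"])

print(s)
```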

48.

Point out the correct statement.
(a) Timedeltas are differences in times, expressed in difference units
(b) You can construct a Timedelta scalar through various arguments
(c) DateOffsets cannot be used in construction
(d) All of the mentioned
Topic: Time Deltas, Data Analysis with Python

Answer» (a) Timedeltas are differences in times, expressed in difference units

Explanation: Timedeltas are expressed in units such as days, hours, minutes, and seconds, and they can be both positive and negative.

49.

Which of the following indexing capabilities is used as a concise means of selecting data from a pandas object?
(a) In
(b) ix
(c) ipy
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (b) ix

Explanation: In older pandas versions, ix offered concise label-based or position-based selection. It was later deprecated and removed in pandas 1.0; use .loc and .iloc instead.

50.

The Quandl API for Python wraps the ________ REST API to return pandas DataFrames with time series indexes.
(a) Quandl
(b) PyDatastream
(c) PyData
(d) None of the mentioned
Topic: Pandas, Data Analysis with Python

Answer» (a) Quandl

Explanation: PyDatastream, by contrast, is a Python interface to the Thomson Dataworks Enterprise (DWE/Datastream) SOAP API that returns indexed pandas DataFrames or Panels with financial data.