Explore topic-wise interview solutions for Pig in Hadoop.

This section includes interview solutions with curated multiple-choice questions on Pig in Hadoop to sharpen your knowledge and support exam preparation.

1.

In comparison to SQL, Pig uses ______________
(a) Lazy evaluation
(b) ETL
(c) Supports pipeline splits
(d) All of the mentioned
Topic: Pig in Practice

Answer»

The correct option is (d) All of the mentioned.

Explanation: Pig uses lazy evaluation, supports ETL-style workloads, and supports pipeline splits; its ability to include user code at any point in the pipeline is useful for pipeline development.

2.

You can specify parameter names and parameter values in which of the following ways?
(a) As part of a command line
(b) In a parameter file, as part of a command line
(c) With the declare statement, as part of a Pig script
(d) All of the mentioned
Topic: Data Processing Operators in Pig

Answer»

The right answer is (d) All of the mentioned.

Explanation: Parameters can be passed on the command line, listed in a parameter file, or set with the declare/default preprocessor statements inside a Pig script; parameter substitution may also be used inside macros.

3.

Which of the following files contains user-defined functions (UDFs)?
(a) script2-local.pig
(b) pig.jar
(c) tutorial.jar
(d) excite.log.bz2
Topic: Data Processing Operators in Pig

Answer»

The correct answer is (c) tutorial.jar.

Explanation: In the Pig tutorial, tutorial.jar contains the user-defined functions along with their supporting Java classes.

4.

Which of the following is the correct syntax for parameter substitution from the command line?
(a) pig {-param param_name = param_value | -param_file file_name} [-debug | -dryrun] script
(b) {%declare | %default} param_name param_value
(c) {%declare | %default} param_name param_value cmd
(d) All of the mentioned
Topic: Data Processing Operators in Pig

Answer»

The right option is (a) pig {-param param_name = param_value | -param_file file_name} [-debug | -dryrun] script

Explanation: Parameter substitution is used to substitute values for parameters at run time; on the command line, parameters are supplied with -param or collected in a file passed with -param_file.
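To make the syntax concrete, here is a minimal, hedged sketch (the script name daily.pig, the parameter names, and the paths are illustrative assumptions, not from the question):

-- daily.pig: filter a log by a date supplied at run time
%default input 'logs/input.txt'    -- %default gives a fallback value inside the script
A = LOAD '$input' USING PigStorage(',') AS (day:chararray, hits:int);
B = FILTER A BY day == '$DATE';
STORE B INTO 'out/$DATE';

-- invoked with a value substituted on the command line:
--   pig -param DATE=2015-06-17 daily.pig
-- or with parameters collected in a file:
--   pig -param_file params.txt daily.pig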

5.

Which of the following commands can be used for debugging?
(a) exec
(b) execute
(c) error
(d) throw
Topic: Data Processing Operators in Pig

Answer»

The correct choice is (a) exec.

Explanation: With the exec command, store statements do not trigger execution on their own; rather, the entire script is parsed before execution starts.

6.

Point out the wrong statement.
(a) You can run Pig scripts from the command line and from the Grunt shell
(b) DECLARE defines a Pig macro
(c) Use Pig scripts to place Pig Latin statements and Pig commands in a single file
(d) None of the mentioned
Topic: Data Processing Operators in Pig

Answer» The right choice is (b) DECLARE defines a Pig macro.

Explanation: DEFINE, not DECLARE, defines a Pig macro; %declare is used for parameter substitution.
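For contrast, a hedged sketch of DEFINE creating a macro (the macro and relation names are illustrative; the pattern follows the standard row-count macro):

DEFINE row_count(X) RETURNS total {
    grp    = GROUP $X ALL;
    $total = FOREACH grp GENERATE COUNT($X);
};

logs = LOAD 'logs.txt' AS (line:chararray);
n    = row_count(logs);   -- the macro expands inline here
DUMP n;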
7.

Use the __________ command to run a Pig script that can interact with the Grunt shell (interactive mode).
(a) fetch
(b) declare
(c) run
(d) All of the mentioned
Topic: Data Processing Operators in Pig

Answer» The right choice is (c) run.

Explanation: With the run command, every STORE statement triggers execution, and the script's aliases remain available in the Grunt shell; with exec, the whole script is parsed before execution starts.
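A small Grunt session contrasting run and exec (the script name myscript.pig and the alias are illustrative assumptions):

grunt> run myscript.pig     -- each STORE triggers execution; the script's aliases stay visible
grunt> DUMP some_alias;     -- works, because run imports the script into the shell session
grunt> exec myscript.pig    -- parsed as a whole and run in isolation; its aliases are not retained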
8.

Which of the following commands is used to show values to keys used in Pig?
(a) set
(b) declare
(c) display
(d) All of the mentioned
Topic: Data Processing Operators in Pig

Answer»

The right answer is (a) set.

Explanation: All Pig and Hadoop properties can be set with the set command, either in the Pig script or via the Grunt command line.
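For illustration, a few set statements as they might appear in a script or at the Grunt prompt (the values are arbitrary examples):

grunt> set job.name 'daily aggregation';   -- label the Hadoop job
grunt> set default_parallel 10;            -- default number of reduce tasks
grunt> set debug 'on';                     -- enable debug-level logging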

9.

Point out the correct statement.
(a) Invoke the Grunt shell using the “enter” command
(b) Pig does not support jar files
(c) Both the run and exec commands are useful for debugging because you can modify a Pig script in an editor
(d) All of the mentioned
Topic: Data Processing Operators in Pig

Answer»

The correct answer is (c) Both the run and exec commands are useful for debugging because you can modify a Pig script in an editor.

Explanation: Both commands promote Pig script modularity, as they allow you to rerun a script from the shell after editing it and to reuse existing components.

10.

__________ method tells LoadFunc which fields are required in the Pig script.
(a) pushProjection()
(b) relativeToAbsolutePath()
(c) prepareToRead()
(d) None of the mentioned
Topic: User-defined Functions in Pig

Answer» The right choice is (a) pushProjection().

Explanation: Pig uses the column index requiredField.index to communicate to the LoadFunc which fields are required by the Pig script.
11.

Which of the following is the shortcut for the DUMP operator?
(a) \de alias
(b) \d alias
(c) \q
(d) None of the mentioned
Topic: Data Processing Operators in Pig

Answer» The correct option is (b) \d alias.

Explanation: If the alias is omitted, the last defined alias is used.
12.

Through the ____________ method, the RecordReader associated with the InputFormat provided by the LoadFunc is passed to the LoadFunc.
(a) getNext()
(b) relativeToAbsolutePath()
(c) prepareToRead()
(d) All of the mentioned
Topic: User-defined Functions in Pig

Answer»

The correct choice is (c) prepareToRead().

Explanation: The RecordReader can then be used by the implementation in getNext() to return a tuple representing a record of data back to Pig.

13.

The loader should use the ______ method to communicate the load information to the underlying InputFormat.
(a) relativeToAbsolutePath()
(b) setUdfContextSignature()
(c) getCacheFiles()
(d) setLocation()
Topic: User-defined Functions in Pig

Answer»

The correct choice is (d) setLocation().

Explanation: The setLocation() method is called by Pig to communicate the load location to the loader, which should then pass that information on to the underlying InputFormat.
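From the script side, the string handed to setLocation() is simply the location given to LOAD; a hedged sketch with a hypothetical custom loader class (com.example.MyEventLoader is not a real Pig class):

-- Pig passes 'hdfs://nn/data/events' to the loader's setLocation(), which should
-- forward it to the underlying InputFormat (typically via FileInputFormat.setInputPaths)
events = LOAD 'hdfs://nn/data/events' USING com.example.MyEventLoader() AS (ts:long, user:chararray);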

14.

___________ returns a list of HDFS files to ship to the distributed cache.
(a) relativeToAbsolutePath()
(b) setUdfContextSignature()
(c) getCacheFiles()
(d) getShipFiles()
Topic: User-defined Functions in Pig

Answer»

The right answer is (c) getCacheFiles().

Explanation: getCacheFiles() lets a loader specify a list of HDFS files it would like placed in the distributed cache, whereas getShipFiles() is the companion method for local files to ship to the cluster.

15.

____________ method will be called by Pig both in the front end and back end to pass a unique signature to the Loader.
(a) relativeToAbsolutePath()
(b) setUdfContextSignature()
(c) getCacheFiles()
(d) getShipFiles()
Topic: User-defined Functions in Pig

Answer»

The correct choice is (b) setUdfContextSignature().

Explanation: The signature can be used to store into the UDFContext any information which the loader needs to keep between the various method invocations in the front end and back end.

16.

Point out the wrong statement.
(a) The load/store UDFs control how data goes into Pig and comes out of Pig
(b) LoadCaster has methods to convert byte arrays to specific types
(c) The meaning of getNext() has changed and is called by the Pig runtime to get the last tuple in the data
(d) None of the mentioned
Topic: User-defined Functions in Pig

Answer»

The right answer is (c) The meaning of getNext() has changed and is called by the Pig runtime to get the last tuple in the data.

Explanation: The meaning of getNext() has not changed; it is called by the Pig runtime to get the next tuple in the data.

17.

Which of the following has methods to deal with metadata?
(a) LoadPushDown
(b) LoadMetadata
(c) LoadCaster
(d) All of the mentioned
Topic: User-defined Functions in Pig

Answer»

The correct option is (b) LoadMetadata.

Explanation: Most loader implementations don't need to implement this interface unless they interact with some metadata system.

18.

Point out the correct statement.
(a) LoadMeta has methods to convert byte arrays to specific types
(b) The Pig load/store API is aligned with the Hadoop InputFormat class only
(c) LoadPush has methods to push operations from the Pig runtime into loader implementations
(d) All of the mentioned
Topic: User-defined Functions in Pig

Answer»

The correct option is (c) LoadPush has methods to push operations from the Pig runtime into loader implementations.

Explanation: Currently only the pushProjection() method is called by Pig to communicate to the loader the exact fields that are required in the Pig script.

19.

The __________ abstract class has three main methods for loading data, and for most use cases it would suffice to extend it.
(a) Load
(b) LoadFunc
(c) FuncLoad
(d) None of the mentioned
Topic: User-defined Functions in Pig

Answer»

The correct option is (b) LoadFunc.

Explanation: LoadFunc and StoreFunc implementations should use the new Hadoop 0.20 API-based classes (InputFormat/OutputFormat and related classes).

20.

Which of the following will compile PigUnit?
(a) $pig_trunk ant pigunit-jar
(b) $pig_tr ant pigunit-jar
(c) $pig_ ant pigunit-jar
(d) None of the mentioned
Topic: Pig Latin

Answer» The correct answer is (a) $pig_trunk ant pigunit-jar.

Explanation: The compile will create the pigunit.jar file.
21.

___________ is a simple xUnit framework that enables you to easily test your Pig scripts.
(a) PigUnit
(b) PigXUnit
(c) PigUnitX
(d) All of the mentioned
Topic: Pig Latin

Answer»

The correct answer is (a) PigUnit.

Explanation: With PigUnit you can perform unit testing, regression testing, and rapid prototyping. No cluster setup is required if you run Pig in local mode.

22.

The ________ class mimics the behavior of the Main class but gives users a statistics object back.
(a) PigRun
(b) PigRunner
(c) RunnerPig
(d) None of the mentioned
Topic: Pig Latin

Answer»

The right answer is (b) PigRunner.

Explanation: Optionally, you can call the API with an implementation of a progress listener, which will be invoked by the Pig runtime during execution.

23.

__________ is a framework for collecting and storing script-level statistics for Pig Latin.
(a) Pig Stats
(b) PStatistics
(c) Pig Statistics
(d) None of the mentioned
Topic: Pig Latin

Answer» The right option is (c) Pig Statistics.

Explanation: The new Pig statistics and the existing Hadoop statistics can also be accessed via the Hadoop job history file.
24.

Point out the wrong statement.
(a) ILLUSTRATE operator is used to review how data is transformed through a sequence of Pig Latin statements
(b) ILLUSTRATE is based on an example generator
(c) Several new private classes make it harder for external tools such as Oozie to integrate with Pig statistics
(d) None of the mentioned
Topic: Pig Latin

Answer» The right choice is (c) Several new private classes make it harder for external tools such as Oozie to integrate with Pig statistics.

Explanation: Several new public classes make it easier for external tools such as Oozie to integrate with Pig statistics.
25.

___________ operator is used to view the step-by-step execution of a series of statements.
(a) ILLUSTRATE
(b) DESCRIBE
(c) STORE
(d) EXPLAIN
Topic: Pig Latin

Answer» The right option is (a) ILLUSTRATE.

Explanation: ILLUSTRATE allows you to test your programs on small datasets and get faster turnaround times.
26.

Which of the following operators is used to view the MapReduce execution plans?
(a) DUMP
(b) DESCRIBE
(c) STORE
(d) EXPLAIN
Topic: Pig Latin

Answer» The right answer is (d) EXPLAIN.

Explanation: EXPLAIN displays the logical, physical, and MapReduce execution plans for a relation.
27.

_________ operator is used to review the schema of a relation.
(a) DUMP
(b) DESCRIBE
(c) STORE
(d) EXPLAIN
Topic: Pig Latin

Answer»

The correct answer is (b) DESCRIBE.

Explanation: DESCRIBE returns the schema of a relation.
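As a quick, hedged sketch of these diagnostic operators on a toy relation (the file name and fields are illustrative assumptions):

users  = LOAD 'users.txt' USING PigStorage(',') AS (name:chararray, age:int);
adults = FILTER users BY age >= 18;
DESCRIBE adults;     -- prints the schema of the relation
EXPLAIN adults;      -- prints the logical, physical, and MapReduce execution plans
ILLUSTRATE adults;   -- shows the step-by-step transformation on sampled rows
DUMP adults;         -- triggers execution and prints the results to the terminal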

28.

Point out the correct statement.
(a) During the testing phase of your implementation, you can use LOAD to display results to your terminal screen
(b) You can view outer relations as well as relations defined in a nested FOREACH statement
(c) Hadoop properties are interpreted by Pig
(d) None of the mentioned
Topic: Pig Latin

Answer»

The correct option is (b) You can view outer relations as well as relations defined in a nested FOREACH statement.

Explanation: Viewing outer relations, as well as relations defined in a nested FOREACH, is possible using the DESCRIBE operator.

29.

$ pig -x tez_local … will enable ________ mode in Pig.
(a) Mapreduce
(b) Tez
(c) Local
(d) None of the mentioned
Topic: Introduction to Pig

Answer»

The right option is (d) None of the mentioned.

Explanation: tez_local enables Tez local mode, which is similar to local mode except that internally Pig invokes the Tez runtime engine.

30.

Which of the following will run Pig in local mode?
(a) $ pig -x local …
(b) $ pig -x tez_local …
(c) $ pig …
(d) None of the mentioned
Topic: Introduction to Pig

Answer»

The correct choice is (a) $ pig -x local …

Explanation: Local mode is specified using the -x flag (pig -x local).
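For reference, a hedged sketch of how the execution mode is selected when invoking Pig (the script name is an illustrative assumption):

$ pig -x local myscript.pig        # local mode: single JVM, local file system
$ pig -x tez_local myscript.pig    # Tez local mode: local, but on the Tez runtime engine
$ pig myscript.pig                 # no -x flag: defaults to mapreduce mode on a Hadoop cluster
$ pig -x mapreduce myscript.pig    # mapreduce mode, stated explicitly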

31.

Which of the following is the default mode?
(a) Mapreduce
(b) Tez
(c) Local
(d) All of the mentioned
Topic: Introduction to Pig

Answer» The correct option is (a) Mapreduce.

Explanation: To run Pig in mapreduce mode, you need access to a Hadoop cluster and an HDFS installation; this is the mode Pig uses when no -x flag is given.
32.

You can run Pig in interactive mode using the ______ shell.
(a) Grunt
(b) FS
(c) HDFS
(d) None of the mentioned
Topic: Introduction to Pig

Answer»

The correct option is (a) Grunt.

Explanation: Invoke the Grunt shell using the “pig” command (see the sketch below) and then enter your Pig Latin statements and Pig commands interactively at the command line.
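A minimal interactive session, assuming local mode and an illustrative input file:

$ pig -x local
grunt> lines = LOAD 'passwd' USING PigStorage(':') AS (user:chararray);
grunt> users = FOREACH lines GENERATE user;
grunt> DUMP users;
grunt> quit;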

33.

Which of the following functions is used to read data in Pig?
(a) WRITE
(b) READ
(c) LOAD
(d) None of the mentioned
Topic: Introduction to Pig

Answer»

The correct choice is (c) LOAD.

Explanation: LOAD reads data into a relation; PigStorage is the default load function and is used when no USING clause is given.
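A hedged sketch of LOAD with and without an explicit load function (the paths and schema are illustrative assumptions):

-- PigStorage is used implicitly, splitting fields on the default tab delimiter
A = LOAD 'data/input.tsv' AS (id:int, name:chararray);

-- the same load function named explicitly, with a custom delimiter
B = LOAD 'data/input.csv' USING PigStorage(',') AS (id:int, name:chararray);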

34.

Point out the wrong statement.
(a) To run Pig in local mode, you need access to a single machine
(b) The DISPLAY operator will display the results to your terminal screen
(c) To run Pig in mapreduce mode, you need access to a Hadoop cluster and HDFS installation
(d) All of the mentioned
Topic: Introduction to Pig

Answer»

The right choice is (b) The DISPLAY operator will display the results to your terminal screen.

Explanation: There is no DISPLAY operator; the DUMP operator displays the results to your terminal screen.

35.

Pig Latin statements are generally organized in which of the following ways?
(a) A LOAD statement to read data from the file system
(b) A series of “transformation” statements to process the data
(c) A DUMP statement to view results or a STORE statement to save the results
(d) All of the mentioned
Topic: Introduction to Pig

Answer»

The correct option is (d) All of the mentioned.

Explanation: A typical script loads data, transforms it, and then emits the result; a DUMP or STORE statement is required to generate output.
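A compact sketch of that load / transform / output structure (the file names and fields are illustrative assumptions):

-- 1. LOAD: read data from the file system
raw     = LOAD 'sales.txt' USING PigStorage(',') AS (store:chararray, amount:double);
-- 2. transformation statements: group and aggregate
bystore = GROUP raw BY store;
totals  = FOREACH bystore GENERATE group AS store, SUM(raw.amount) AS total;
-- 3. DUMP to inspect on screen, or STORE to save the results
STORE totals INTO 'output/sales_totals' USING PigStorage(',');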

36.

You can run Pig in batch mode using __________
(a) Pig shell command
(b) Pig scripts
(c) Pig options
(d) All of the mentioned
Topic: Introduction to Pig

Answer»

The correct option is (b) Pig scripts.

Explanation: A Pig script is a file containing Pig Latin statements and Pig commands; passing it to the pig command runs it in batch mode.

37.

Point out the correct statement.
(a) You can run Pig in either mode using the “pig” command
(b) You can run Pig in batch mode using the Grunt shell
(c) You can run Pig in interactive mode using the FS shell
(d) None of the mentioned
Topic: Introduction to Pig

Answer»

The correct choice is (a) You can run Pig in either mode using the “pig” command.

Explanation: You can run Pig in either mode using the “pig” command (the bin/pig script) or the “java” command (java -cp pig.jar …).
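A hedged sketch of the two equivalent invocations (the script name is an illustrative assumption):

$ pig -x mapreduce wordcount.pig                                     # via the bin/pig script
$ java -cp pig.jar org.apache.pig.Main -x mapreduce wordcount.pig    # via the java command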

38.

Pig mainly operates in how many modes?
(a) Two
(b) Three
(c) Four
(d) Five
Topic: Introduction to Pig

Answer»

The correct choice is (a) Two.

Explanation: You can run Pig (execute Pig Latin statements and Pig commands) in two ways: interactive mode and batch mode.