37+ Interview Questions on MapReduce in Big Data

1.	Partitioning behaves like a hash function.
Answer» Choose the correct answer from the list below: (1) True (2) False. Answer: (1) True
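
As a rough illustration of why partitioning behaves like a hash function, the sketch below mimics in Python the formula used by Hadoop's default HashPartitioner, `(hashCode & Integer.MAX_VALUE) % numReduceTasks`. The helper `java_style_hash` is a hypothetical stand-in for a key's `hashCode()`, and the sample keys are made up; treat it as a simplified model rather than Hadoop code.

```python
def java_style_hash(key: str) -> int:
    """Hypothetical stand-in for hashCode(): a 32-bit polynomial hash."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep the value within 32 bits
    return h


def get_partition(key: str, num_reduce_tasks: int) -> int:
    # Clear the sign bit, then take the remainder to pick a reducer index.
    return (java_style_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


# Made-up sample keys, assigned to 3 reducers.
keys = ["apple", "banana", "apple", "cherry", "banana"]
partitions = {}
for k in keys:
    partitions.setdefault(get_partition(k, 3), []).append(k)
```

Because the partition depends only on the key's hash, every occurrence of the same key is sent to the same reducer.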

2.	Which of the following is not a phase of the reducer?
Answer» Choose the correct answer from the list below: (1) Sort (2) Shuffle (3) Reduce (4) Map. Answer: (4) Map

3.	The main objective of combiners is to increase the output value of the mapper.
Answer» Choose the correct answer from the list below: (1) True (2) False. Answer: (2) False. A combiner aggregates mapper output locally, which reduces the amount of data passed on to the reducers.
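
To make the combiner's role concrete, here is a small Python sketch of a word-count mapper followed by a local combine step. The function names `map_words` and `combine` and the sample sentence are illustrative assumptions, not Hadoop APIs.

```python
from collections import Counter


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: sum the counts per word on the mapper side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


mapper_output = map_words("to be or not to be")  # 6 pairs
combined_output = combine(mapper_output)         # 4 pairs after combining
```

The combined output carries the same totals in fewer records, so less data has to be shuffled to the reducers.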

4.	Which command is used to end a failed job on MapReduce?
Answer» Choose the correct answer from the list below: (1) terminate (2) kill (3) remove (4) delete. Answer: (2) kill

5.	The partition divides the data into segments.
Answer» Choose the correct answer from the list below: (1) True (2) False. Answer: (1) True

6.	Which of the following is not a consideration for a MapReduce program?
Answer» Choose the correct answer from the list below: (1) Knowing the different ways of deploying a MapReduce job. (2) Ensuring you have all the JARs in your classpath. (3) Putting together a runtime configuration before running the program in Eclipse. (4) Knowing which version of Java is best for your version of Hadoop. Answer: (1) Knowing the different ways of deploying a MapReduce job.

7.	What makes HDFS different from other filesystems?
Answer» Choose the correct answer from the list below: (1) Large amounts of data are laid across the disk in sequential order. (2) HDFS is expected to have very large files that are far from each other. (3) Metadata about each file in HDFS is kept by the data nodes. (4) The name node goes to more places to find all the data blocks. Answer: (1) Large amounts of data are laid across the disk in sequential order.

8.	_______________ is the processing unit of Hadoop, using which the data in Hadoop can be processed.
Answer» Choose the correct option from the list below: (1) MapReduce (2) Hive (3) None of the options (4) HBase. Answer: (1) MapReduce

9.	Who introduced MapReduce?
Answer» Choose the correct option from the list below: (1) Amazon (2) Twitter (3) Google (4) Facebook. Answer: (3) Google

10.	Which of the following are the best testing and debugging practices for MapReduce jobs?
Answer» Choose the correct answer from the list below: (1) Build a small Hadoop cluster for the sole purpose of debugging and testing MapReduce code. (2) Use proper development techniques, like encapsulation and abstraction. (3) Build unit test cases that will behave unpredictably in different Hadoop environments. (4) Ensure test cases run successfully in most of your deployments. Answer: (1) Build a small Hadoop cluster for the sole purpose of debugging and testing MapReduce code.

11.	The number of maps is usually driven by the total size of ______.
Answer» Choose the correct option from the list below: (1) Tasks (2) Inputs (3) Outputs. Answer: (2) Inputs

12.	______________ decides the number of mappers.
Answer» Choose the correct answer from the list below: (1) Inputs (2) None of the options (3) Tasks (4) Outputs. Answer: (1) Inputs
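
The link between input size and the number of mappers can be sketched with some simple arithmetic. The Python below assumes one map task per input split, splittable files, and a split size equal to a 128 MB HDFS block; the function name `estimate_map_tasks` is hypothetical, and real InputFormat implementations apply extra rules that this estimate ignores.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed split size: one 128 MB HDFS block


def estimate_map_tasks(file_sizes_bytes, split_size_bytes=BLOCK_SIZE):
    """Estimate map tasks as one split per block-sized chunk of each file."""
    return sum(
        math.ceil(size / split_size_bytes)
        for size in file_sizes_bytes
        if size > 0  # empty files contribute no splits in this model
    )


one_gib = 1024 * 1024 * 1024
tasks_for_one_gib = estimate_map_tasks([one_gib])
tasks_for_two_files = estimate_map_tasks([300 * 1024 * 1024, 10 * 1024 * 1024])
```

Under these assumptions a 1 GiB file yields 8 map tasks, and a 300 MB file plus a 10 MB file yield 3 + 1 = 4, so the mapper count follows the input rather than being chosen directly.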

13.	Which of the following are the advantages of MapReduce?
Answer» Choose the correct answer from the list below: (1) Both the options (2) You are able to process the data where it is. (3) Data gets processed in parallel using Hadoop MapReduce, and hence the processing becomes fast. (4) None of the options. Answer: (1) Both the options

14.	___________________ programming model is designed for processing data in parallel by dividing the work into a set of independent tasks.
Answer» Choose the correct answer from the list below: (1) Pig (2) Hive (3) None of the options (4) MapReduce. Answer: (4) MapReduce

15.	This list of values goes through a shuffle phase, and the values are given to the reducer.
Answer» Choose the correct answer from the list below: (1) True (2) False. Answer: (1) True

16.	Every input is being counted by the map().
Answer» Choose the correct answer from the list below: (1) True (2) False. Answer: (1) True

17.	Which of the following maps input key/value pairs to a set of intermediate key/value pairs?
Answer» Choose the correct answer from the list below: (1) Mapper (2) Reducer (3) Both Mapper and Reducer. Answer: (1) Mapper

18.	Which of the following is true about MapReduce?
Answer» Choose the correct answer from the list below: (1) It provides resource management. (2) An open source data warehouse system for querying and analyzing large datasets stored in Hadoop files. (3) Data processing layer of Hadoop. Answer: (3) Data processing layer of Hadoop

19.	The nodes in MapReduce are collectively known as ___________.
Answer» Choose the correct option from the list below: (1) Bundle (2) Cluster (3) Group (4) Hive. Answer: (2) Cluster

20.	MapReduce is a model that processes ________________.
Answer» Choose the correct option from the list below: (1) Finite data set (2) Small data set (3) Big Data set (4) Infinite data set. Answer: (3) Big Data set

21.	When did Google publish its paper on MapReduce?
Answer» Choose the correct option from the list below: (1) 2004 (2) 2001 (3) 2010 (4) 2002. Answer: (1) 2004

22.	Keys from the output of shuffle and sort implement which of the following interfaces?
Answer» Choose the correct answer from the list below: (1) Configurable (2) ComparableWritable (3) WritableComparable (4) Writable. Answer: (3) WritableComparable

23.	Which of the commands below is used to set the number of reducers for a job?
Answer» Choose the correct answer from the list below: (1) Job.confNumReduceTasks() (2) Job.confNumReduceTasks(int) (3) Job.setNumReduceTasks() (4) Job.setNumReduceTasks(int). Answer: (4) Job.setNumReduceTasks(int)

24.	Which OutputFormat is used to write to relational databases?
Answer» Choose the correct answer from the list below: (1) DBOutputFormat (2) TextOutputFormat (3) SequenceFileOutputFormat (4) MapFileOutputFormat. Answer: (1) DBOutputFormat

25.	Why is MapReduce required in the first place?
Answer» Choose the correct option from the list below: (1) None of the options (2) Because of load on the servers (3) Because of network issues (4) Because the Big Data stored in HDFS is not stored in a traditional fashion. Answer: (4) Because the Big Data stored in HDFS is not stored in a traditional fashion.

26.	Which of the following is a component of MapReduce?
Answer» Choose the correct option from the list below: (1) MapReduceBaseClass (2) Mapper (3) MapClass. Answer: (2) Mapper

27.	Which of the following is true about JUnit?
Answer» Choose the correct answer from the list below: (1) It is not part of the standard Java class libraries. (2) JUnit is a Java library designed for unit testing. (3) It provides automated testing and validations. (4) All of the options. Answer: (4) All of the options

28.	Define the process of spilling in MapReduce.
Answer» Spilling is the process of copying data from the in-memory buffer to disk once buffer usage reaches a specific threshold. It happens when there is not enough memory to hold all of the mapper output. By default, a background thread starts spilling when the buffer is 80 percent full. For example, with a 100 MB buffer, spilling starts once the buffer contents reach 80 MB.
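
The threshold arithmetic in this answer can be sketched as a toy simulation. The Python below assumes a 100 MB buffer and an 80 percent threshold, as in the example above; `simulate_spills` is a hypothetical helper, and it empties the buffer in a single step, whereas real spilling is more involved.

```python
def simulate_spills(record_sizes_mb, buffer_mb=100, threshold_percent=80):
    """Return the buffer usage at each spill, plus the usage left at the end."""
    limit_mb = buffer_mb * threshold_percent / 100  # 80 MB with the defaults
    used_mb = 0.0
    spills = []
    for size in record_sizes_mb:
        used_mb += size
        if used_mb >= limit_mb:
            spills.append(used_mb)  # buffer contents are written to disk
            used_mb = 0.0           # simplification: buffer is emptied at once
    return spills, used_mb


# Five made-up 20 MB chunks of mapper output.
spills, remaining = simulate_spills([20, 20, 20, 20, 20])
```

With five 20 MB chunks, the fourth one brings usage to 80 MB and triggers one spill, leaving the last 20 MB in the buffer.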

29.	_______ is a Java library designed for unit testing.
Answer» Choose the correct answer from the list below: (1) JUnit (2) REST Assured (3) Selenium (4) All of the options. Answer: (1) JUnit

30.	Cloudera has developed a testing framework for MapReduce known as ______________.
Answer» Choose the correct answer from the list below: (1) REST Assured (2) DBUnit (3) JUnit (4) MRUnit. Answer: (4) MRUnit

31.	What is the correct sequence of data flow?
Answer» The stages are: a. InputFormat b. Mapper c. Combiner d. Reducer e. Partitioner f. OutputFormat. Choose the correct answer from the list below: (1) acdefb (2) abcdfe (3) abcedf (4) abcdef. Answer: (3) abcedf
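
The order InputFormat, Mapper, Combiner, Partitioner, Reducer, OutputFormat can be traced in a small word-count simulation. Everything below is an illustrative Python sketch: the function names only mirror the stage names, each input line is treated as one map task, and the partitioner uses a made-up byte-sum hash.

```python
from collections import defaultdict


def input_format(text):              # a. split the input into records
    return text.splitlines()


def mapper(line):                    # b. emit (word, 1) pairs
    return [(word.lower(), 1) for word in line.split()]


def combiner(pairs):                 # c. aggregate locally per map task
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):  # e. pick a reducer for each key
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):            # d. final aggregation per key
    return key, sum(values)


def output_format(results):          # f. write tab-separated lines
    return "\n".join(f"{key}\t{value}" for key, value in sorted(results))


def run_job(text, num_reducers=2):
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in input_format(text):
        for key, value in combiner(mapper(line)):
            buckets[partitioner(key, num_reducers)][key].append(value)
    results = [reducer(key, bucket[key])
               for bucket in buckets for key in sorted(bucket)]
    return output_format(results)


report = run_job("the cat sat\nthe cat ran")
```

Note that the partitioner runs on the combined map output before any reducer sees it, which is why option (3) places e before d.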

32.	The JobContext interface's main class is the Job class.
Answer» Choose the correct answer from the list below: (1) True (2) False. Answer: (1) True

33.	What happens if the number of reducers is set to 0?
Answer» Choose the correct answer from the list below: (1) A reduce-only job takes place (2) A map-only job takes place (3) Reducer output will be the final output. Answer: (2) A map-only job takes place
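
A toy model of the difference between a map-only job and a regular job is sketched below in Python. The `run_job` function and its arguments are hypothetical and only imitate the behavior: with zero reducers the mapper output is returned as the final output, with no grouping, sorting, or reduce step.

```python
def word_mapper(line):
    """Emit a (word, 1) pair for each word in the line."""
    return [(word, 1) for word in line.split()]


def sum_reducer(key, values):
    """Sum all values collected for one key."""
    return key, sum(values)


def run_job(records, mapper, reducer=None, num_reducers=1):
    mapped = [pair for record in records for pair in mapper(record)]
    if num_reducers == 0:
        return mapped  # map-only job: mapper output is the final output
    grouped = {}
    for key, value in mapped:
        grouped.setdefault(key, []).append(value)
    return [reducer(key, grouped[key]) for key in sorted(grouped)]


map_only = run_job(["b a b"], word_mapper, num_reducers=0)
with_reduce = run_job(["b a b"], word_mapper, sum_reducer, num_reducers=1)
```

In the map-only run the pairs come out in mapper order and duplicates are not merged, while the regular run returns one sorted, summed entry per key.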

34.	In recovery mode, the name node is started to _________.
Answer» Choose the correct option from the list below: (1) Recover data when there is only one metadata storage location (2) Recover a failed name node (3) Recover data from one of the metadata storage locations (4) Recover a failed data node. Answer: (1) Recover data when there is only one metadata storage location

35.	Which of the following is not a Hadoop output format?
Answer» Choose the correct answer from the list below: (1) DBOutputFormat (2) TextOutputFormat (3) SequenceFileOutputFormat (4) ByteOutputFormat. Answer: (4) ByteOutputFormat

36.	Identity Mapper is the default Hadoop mapper.
Answer» Choose the correct answer from the list below: (1) False (2) True. Answer: (2) True

37.	The number of tests should be kept to a minimum because each separate test suite requires a mini cluster to be started at the creation of the test.
Answer» Choose the correct answer from the list below: (1) False (2) True. Answer: (2) True
