1. Point out the wrong statement.

(a) Hadoop works better with a small number of large files than a large number of small files
(b) CombineFileInputFormat is designed to work well with small files
(c) CombineFileInputFormat does not compromise the speed at which it can process the input in a typical MapReduce job
(d) None of the mentioned

Answer» The correct option is (c) CombineFileInputFormat does not compromise the speed at which it can process the input in a typical MapReduce job

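Explanation: If the input files are very small (“small” means significantly smaller than an HDFS block) and there are a lot of them, then each map task processes very little input, and there are a lot of map tasks (one per file), each of which imposes extra bookkeeping overhead. CombineFileInputFormat reduces this overhead by packing many small files into each input split, so each map task has more data to process.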
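To make the overhead concrete, here is a minimal sketch in Python that counts map tasks for the same input under two strategies: one split per file, and files packed together up to a size limit. It is an illustrative model only, not Hadoop's implementation: the function names, the 1,000 files of 1 MB each, and the 128 MB limit are all assumptions, and the real CombineFileInputFormat also takes node and rack locality into account when grouping blocks.

```python
def splits_per_file(file_sizes):
    """Model of one split (and so one map task) per small file."""
    return [[size] for size in file_sizes]


def combined_splits(file_sizes, max_split_bytes):
    """Greedily pack files into splits of at most max_split_bytes.

    Simplified stand-in for combining small files; it ignores data locality.
    """
    splits, current, current_bytes = [], [], 0
    for size in file_sizes:
        # Start a new split when adding this file would exceed the limit.
        if current and current_bytes + size > max_split_bytes:
            splits.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        splits.append(current)
    return splits


MB = 1024 * 1024
small_files = [1 * MB] * 1000   # hypothetical input: 1,000 files of 1 MB each
max_split = 128 * MB            # assumed limit, matching a common HDFS block size

per_file_tasks = len(splits_per_file(small_files))
combined_tasks = len(combined_splits(small_files, max_split))

print("map tasks, one split per file:", per_file_tasks)
print("map tasks, combined splits:  ", combined_tasks)
```

With these assumed numbers, the per-file strategy schedules 1,000 map tasks while the combined strategy schedules 8, which is where the saving in per-task bookkeeping comes from.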

