Explore topic-wise Interview Solutions in Hadoop.

This section includes Interview Solutions, each offering curated multiple-choice questions to sharpen your knowledge and support exam preparation. Choose a topic below to get started.

1.

The __________ codec uses Google’s Snappy compression library.
(a) null
(b) snappy
(c) deflate
(d) none of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct choice is (b) snappy

Explanation: Snappy is a compression library developed at Google, and, like many technologies that come from Google, Snappy was designed to be fast.
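
As a quick illustration, here is a minimal writer sketch that enables the Snappy codec for an Avro container file (the log.avro file name and the Log schema are made up for this example, and the snappy-java library is assumed to be on the classpath):

    import java.io.File;
    import org.apache.avro.Schema;
    import org.apache.avro.file.CodecFactory;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class SnappyAvroWriterSketch {
        public static void main(String[] args) throws Exception {
            Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Log\",\"fields\":"
                + "[{\"name\":\"msg\",\"type\":\"string\"}]}");

            DataFileWriter<GenericRecord> writer =
                new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema));
            writer.setCodec(CodecFactory.snappyCodec()); // file blocks are compressed with Snappy
            writer.create(schema, new File("log.avro"));

            GenericRecord rec = new GenericData.Record(schema);
            rec.put("msg", "hello");
            writer.append(rec);
            writer.close();
        }
    }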

2.

_____________ are used between blocks to permit efficient splitting of files for MapReduce processing.
(a) Codec
(b) Data Marker
(c) Synchronization markers
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The right answer is (c) Synchronization markers

Explanation: Avro includes a simple object container file format in which synchronization markers are written between blocks, so that files can be split efficiently for MapReduce processing.

3.

________ permits data written by one system to be efficiently sorted by another system.
(a) Complex Data type
(b) Order
(c) Sort Order
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct answer is (c) Sort Order

To explain: Avro defines a standard sort order, so binary-encoded data can be efficiently compared and ordered without deserializing it to objects.
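
A rough sketch of that idea, comparing two binary-encoded string values directly on their bytes with the BinaryData helper from the Java library (the values and the string schema are arbitrary examples):

    import java.io.ByteArrayOutputStream;
    import org.apache.avro.Schema;
    import org.apache.avro.io.BinaryData;
    import org.apache.avro.io.BinaryEncoder;
    import org.apache.avro.io.EncoderFactory;

    public class BinaryCompareSketch {
        // Encode a single string value with Avro's binary encoding.
        static byte[] encode(String s) throws Exception {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
            enc.writeString(s);
            enc.flush();
            return out.toByteArray();
        }

        public static void main(String[] args) throws Exception {
            Schema schema = Schema.create(Schema.Type.STRING);
            byte[] a = encode("apple");
            byte[] b = encode("banana");
            // Compare the encoded bytes directly, without deserializing them.
            int cmp = BinaryData.compare(a, 0, b, 0, schema);
            System.out.println(cmp < 0); // true: "apple" sorts before "banana"
        }
    }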

4.

________ instances are encoded using the number of bytes declared in the schema.
(a) Fixed
(b) Enum
(c) Unions
(d) Maps
Topic: Avro (Hadoop I/O)

Answer»

The right choice is (a) Fixed

The explanation is: A fixed instance is written as exactly the number of bytes declared in its schema, with no length prefix.

5.

Point out the wrong statement.
(a) Record, enums and fixed are named types
(b) Unions may immediately contain other unions
(c) A namespace is a dot-separated sequence of such names
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct choice is (b) Unions may immediately contain other unions

To explain: Unions may not immediately contain other unions.

6.

________ are encoded as a series of blocks.
(a) Arrays
(b) Enum
(c) Unions
(d) Maps
Topic: Avro (Hadoop I/O)

Answer»

The correct choice is (a) Arrays

Easy explanation: Each block of the array consists of a long count value, followed by that many array items. A block with count zero indicates the end of the array. Each item is encoded per the array’s item schema.
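
A minimal sketch of that block structure using the low-level Java encoder (the three long items are arbitrary; method names follow org.apache.avro.io.Encoder):

    import java.io.ByteArrayOutputStream;
    import org.apache.avro.io.BinaryEncoder;
    import org.apache.avro.io.EncoderFactory;

    public class ArrayBlockSketch {
        public static void main(String[] args) throws Exception {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);

            // One block holding three long items...
            enc.writeArrayStart();
            enc.setItemCount(3);
            for (long v : new long[] {10, 20, 30}) {
                enc.startItem();
                enc.writeLong(v);
            }
            // ...followed by a block with count zero marking the end of the array.
            enc.writeArrayEnd();
            enc.flush();

            System.out.println(out.size() + " bytes"); // count + items + terminating zero
        }
    }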

7.

Avro supports ______ kinds of complex types.
(a) 3
(b) 4
(c) 6
(d) 7
Topic: Avro (Hadoop I/O)

Answer»

The correct option is (c) 6

To elaborate: Avro supports six kinds of complex types: records, enums, arrays, maps, unions and fixed.
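
For reference, a hypothetical schema that touches most of these complex types, parsed with the Java Schema.Parser (all names in the schema are made up):

    import org.apache.avro.Schema;

    public class ComplexTypesSketch {
        public static void main(String[] args) {
            // Hypothetical record combining enum, array, map, union and fixed fields.
            String json = "{"
                + "\"type\":\"record\",\"name\":\"Event\",\"fields\":["
                + "{\"name\":\"level\",\"type\":{\"type\":\"enum\",\"name\":\"Level\",\"symbols\":[\"INFO\",\"WARN\"]}},"
                + "{\"name\":\"tags\",\"type\":{\"type\":\"array\",\"items\":\"string\"}},"
                + "{\"name\":\"attrs\",\"type\":{\"type\":\"map\",\"values\":\"long\"}},"
                + "{\"name\":\"note\",\"type\":[\"null\",\"string\"]},"
                + "{\"name\":\"checksum\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}}"
                + "]}";
            Schema schema = new Schema.Parser().parse(json);
            System.out.println(schema.toString(true)); // pretty-print the parsed schema
        }
    }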

8.

Point out the correct statement.
(a) Records use the type name “record” and support three attributes
(b) Enum are represented using JSON arrays
(c) Avro data is always serialized with its schema
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct option is (a) Records use the type name “record” and support three attributes

Best explanation: A record is encoded by encoding the values of its fields in the order that they are declared.

9.

Which of the following is a primitive data type in Avro?
(a) null
(b) boolean
(c) float
(d) all of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The right option is (d) all of the mentioned

For explanation: null, boolean, int, long, float, double, bytes and string are Avro's primitive types, and primitive type names are also defined type names.

10.

________ are a way of encoding structured data in an efficient yet extensible format.
(a) Thrift
(b) Protocol Buffers
(c) Avro
(d) None of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct choice is (b) Protocol Buffers

The explanation is: Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

11.

When using reflection to automatically build our schemas without code generation, we need to configure Avro using which of the following?
(a) AvroJob.Reflect(jConf);
(b) AvroJob.setReflect(jConf);
(c) Job.setReflect(jConf);
(d) None of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct answer is (b) AvroJob.setReflect(jConf);

To explain: AvroJob is the class used to configure Avro MapReduce jobs; the reflect-based configuration builds schemas from existing Java classes, so no code generation is needed. For strongly typed languages like Java, Avro also provides a code generation layer, including RPC service code generation.

12.

Avro is said to be the future _______ layer of Hadoop.
(a) RMC
(b) RPC
(c) RDC
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer» The right answer is (b) RPC

The best I can explain: When Avro is used in RPC, the client and server exchange schemas in the connection handshake.

13.

Thrift resolves possible conflicts through _________ of the field.
(a) Name
(b) Static number
(c) UID
(d) None of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct option is (b) Static number

The explanation: Thrift resolves conflicts through the static number of the field, whereas Avro resolves them through the name of the field.

14.

Point out the wrong statement.
(a) Apache Avro™ is a data serialization system
(b) Avro provides simple integration with dynamic languages
(c) Avro provides rich data structures
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct choice is (d) All of the mentioned

Easiest explanation: Code generation is not required to read or write data files nor to use or implement RPC protocols in Avro.

15.

With ______ we can store data and read it easily with various programming languages.
(a) Thrift
(b) Protocol Buffers
(c) Avro
(d) None of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct option is (c) Avro

The explanation: Avro is optimized to minimize the disk space needed by our data, and it is flexible.

16.

__________ facilitates construction of generic data-processing systems and languages.
(a) Untagged data
(b) Dynamic typing
(c) No manually-assigned field IDs
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The right choice is (b) Dynamic typing

For explanation I would say: Avro does not require that code be generated; data is always accompanied by its schema, which permits generic processing of the data.

17.

Point out the correct statement.
(a) Avro provides functionality similar to systems such as Thrift
(b) When Avro is used in RPC, the client and server exchange data in the connection handshake
(c) Apache Avro, Avro, Apache, and the Avro and Apache logos are trademarks of The Java Foundation
(d) None of the mentioned
Topic: Avro (Hadoop I/O)

Answer»

The correct answer is (a) Avro provides functionality similar to systems such as Thrift

Easy explanation: Avro differs from these systems in fundamental aspects such as untagged data.

18.

Avro schemas are defined with _____
(a) JSON
(b) XML
(c) JAVA
(d) All of the mentioned
Topic: Avro (Hadoop I/O)

Answer» The right choice is (a) JSON

For explanation I would say: Defining schemas in JSON facilitates implementation in languages that already have JSON libraries.

19.

Which of the following works well with Avro?
(a) Lucene
(b) kafka
(c) MapReduce
(d) None of the mentioned
Topic: Serialization (Hadoop I/O)

Answer»

The correct answer is (c) MapReduce

To explain: You can use Avro and MapReduce together to process many items serialized with Avro’s small binary format.
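
A rough job-setup sketch, assuming the avro-mapred module's new-API helpers (org.apache.avro.mapreduce) are used; the Line schema and job name are made up, and the mapper/reducer wiring is omitted:

    import org.apache.avro.Schema;
    import org.apache.avro.mapreduce.AvroJob;
    import org.apache.avro.mapreduce.AvroKeyInputFormat;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class AvroWordJobSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "avro-input-sketch");
            job.setJarByClass(AvroWordJobSketch.class);

            // Records serialized in Avro's compact binary format arrive in the mapper
            // wrapped as Avro keys.
            Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Line\",\"fields\":"
                + "[{\"name\":\"text\",\"type\":\"string\"}]}");
            job.setInputFormatClass(AvroKeyInputFormat.class);
            AvroJob.setInputKeySchema(job, schema);

            // Mapper, reducer, output format and paths omitted for brevity.
        }
    }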

20.

The ________ method in the ModelCountReducer class “reduces” the values the mapper collects into a derived value.
(a) count
(b) add
(c) reduce
(d) all of the mentioned
Topic: Serialization (Hadoop I/O)

Answer» The correct option is (c) reduce

Explanation: In some cases, it can be a simple sum of the values.

21.

____________ class accepts the values that the ModelCountMapper object has collected.
(a) AvroReducer
(b) Mapper
(c) AvroMapper
(d) None of the mentioned
Topic: Serialization (Hadoop I/O)

Answer» The right answer is (a) AvroReducer

Explanation: AvroReducer summarizes them by looping through the values.

22.

The ____________ class extends and implements several Hadoop-supplied interfaces.
(a) AvroReducer
(b) Mapper
(c) AvroMapper
(d) None of the mentioned
Topic: Serialization (Hadoop I/O)

Answer»

The correct answer is (c) AvroMapper

To explain I would say: AvroMapper is used to provide the ability to collect or map data.

23.

Point out the wrong statement.
(a) Java code is used to deserialize the contents of the file into objects
(b) Avro allows you to use complex data structures within Hadoop MapReduce jobs
(c) The m2e plugin automatically downloads the newly added JAR files and their dependencies
(d) None of the mentioned
Topic: Serialization (Hadoop I/O)

Answer»

The correct answer is (d) None of the mentioned

Easiest explanation: A unit test is useful because you can make assertions to verify that the values of the deserialized object are the same as the original values.

24.

Avro schemas describe the format of the message and are defined using ______________
(a) JSON
(b) XML
(c) JS
(d) All of the mentioned
Topic: Serialization (Hadoop I/O)

Answer»

The correct option is (a) JSON

Best explanation: The JSON schema content is put into a file (conventionally with a .avsc extension) that the application reads at run time.
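
For instance, a minimal sketch that loads a schema from such a file (the user.avsc path is only an example):

    import java.io.File;
    import org.apache.avro.Schema;

    public class LoadSchemaSketch {
        public static void main(String[] args) throws Exception {
            // Parse a JSON schema definition stored in a file.
            Schema schema = new Schema.Parser().parse(new File("user.avsc"));
            System.out.println(schema.getName());   // record name
            System.out.println(schema.getFields()); // declared fields
        }
    }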

25.

The ____________ is an iterator which reads through the file and returns objects using the next() method.
(a) DatReader
(b) DatumReader
(c) DatumRead
(d) None of the mentioned
Topic: Serialization (Hadoop I/O)

Answer»

The correct choice is (b) DatumReader

Easiest explanation: The DatumReader reads the record content, while the DataFileReader wraps it and iterates through the file, returning objects via next().
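
A minimal reading sketch, assuming an Avro container file named events.avro:

    import java.io.File;
    import org.apache.avro.file.DataFileReader;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.avro.io.DatumReader;

    public class ReadAvroSketch {
        public static void main(String[] args) throws Exception {
            DatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
            // DataFileReader iterates over the file; the schema is read from the file header.
            try (DataFileReader<GenericRecord> fileReader =
                     new DataFileReader<>(new File("events.avro"), datumReader)) {
                while (fileReader.hasNext()) {
                    GenericRecord record = fileReader.next();
                    System.out.println(record);
                }
            }
        }
    }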

26.

Point out the correct statement.
(a) Apache Avro is a framework that allows you to serialize data in a format that has a schema built in
(b) The serialized data is in a compact binary format that doesn’t require proxy objects or code generation
(c) Including schemas with the Avro messages allows any application to deserialize the data
(d) All of the mentioned
Topic: Serialization (Hadoop I/O)

Answer»

The correct answer is (d) All of the mentioned

Easy explanation: Instead of using generated proxy libraries and strong typing, Avro relies heavily on the schemas that are sent along with the serialized data.
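
A sketch of the raw binary-encoding path, with no proxies or generated classes (the User schema and field values are made up for the example):

    import java.io.ByteArrayOutputStream;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.avro.io.BinaryEncoder;
    import org.apache.avro.io.EncoderFactory;

    public class BinaryEncodeSketch {
        public static void main(String[] args) throws Exception {
            Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
                + "{\"name\":\"name\",\"type\":\"string\"},"
                + "{\"name\":\"age\",\"type\":\"int\"}]}");

            GenericRecord user = new GenericData.Record(schema);
            user.put("name", "Ada");
            user.put("age", 36);

            // Serialize to Avro's compact binary format; the schema travels separately
            // (for container files, in the file header), not inside each message.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(schema).write(user, encoder);
            encoder.flush();

            System.out.println(out.size() + " bytes"); // only a handful of bytes for this record
        }
    }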

27.

Apache _______ is a serialization framework that produces data in a compact binary format.
(a) Oozie
(b) Impala
(c) kafka
(d) Avro
Topic: Serialization (Hadoop I/O)

Answer»

The correct choice is (d) Avro

For explanation: Apache Avro doesn’t require proxy objects or code generation.

28.

_________ stores its metadata on multiple disks that typically include a non-local file server.
(a) DataNode
(b) NameNode
(c) ActionNode
(d) None of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer» The right answer is (b) NameNode

The best I can explain: HDFS tolerates failures of storage servers (called DataNodes) and their disks.

29.

HDFS, by default, replicates each data block _____ times on different nodes and on at least ____ racks.
(a) 3, 2
(b) 1, 2
(c) 2, 3
(d) All of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer»

The correct choice is (a) 3, 2

The best I can explain: HDFS has a simple yet robust architecture that was explicitly designed for data reliability in the face of faults and failures in disks, nodes and networks.

30.

Automatic restart and ____________ of the NameNode software to another machine is not supported.
(a) failover
(b) end
(c) scalability
(d) all of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer»

The correct option is (a) failover

To explain I would say: If the NameNode machine fails, manual intervention is necessary.

31.

Point out the wrong statement.
(a) HDFS is designed to support small files only
(b) Any update to either the FsImage or EditLog causes each of the FsImages and EditLogs to get updated synchronously
(c) NameNode can be configured to support maintaining multiple copies of the FsImage and EditLog
(d) None of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer»

The right option is (a) HDFS is designed to support small files only

For explanation: HDFS is designed to support very large files.

32.

__________ support storing a copy of data at a particular instant of time.
(a) Data Image
(b) Datanots
(c) Snapshots
(d) All of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer»

The correct choice is (c) Snapshots

To elaborate: One usage of the snapshot feature may be to roll back a corrupted HDFS instance to a previously known good point in time.

33.

The ____________ and the EditLog are central data structures of HDFS.
(a) DsImage
(b) FsImage
(c) FsImages
(d) All of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer» The correct choice is (b) FsImage

Explanation: A corruption of these files can cause the HDFS instance to be non-functional.

34.

The ___________ machine is a single point of failure for an HDFS cluster.
(a) DataNode
(b) NameNode
(c) ActionNode
(d) All of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer»

The correct option is (b) NameNode

Explanation: If the NameNode machine fails, manual intervention is necessary. Currently, automatic restart and failover of the NameNode software to another machine is not supported.

35.

Point out the correct statement.
(a) The HDFS architecture is compatible with data rebalancing schemes
(b) Datablocks support storing a copy of data at a particular instant of time
(c) HDFS currently support snapshots
(d) None of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer»

The right choice is (a) The HDFS architecture is compatible with data rebalancing schemes

The explanation: A scheme might automatically move data from one DataNode to another if the free space on a DataNode falls below a certain threshold.

36.

The HDFS client software implements __________ checking on the contents of HDFS files.
(a) metastore
(b) parity
(c) checksum
(d) none of the mentioned
Topic: Data Integrity (Hadoop I/O)

Answer» The correct option is (c) checksum

Explanation: When a client creates an HDFS file, it computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace.
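
A small client-side sketch of this behaviour (the /data/sample.txt path is illustrative; checksum verification is on by default, and the setVerifyChecksum call only makes that explicit):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ChecksumReadSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Checksum verification is enabled by default; this call just makes it explicit.
            fs.setVerifyChecksum(true);

            // Reading the file verifies the data against its stored checksums;
            // a mismatch surfaces as a ChecksumException.
            try (FSDataInputStream in = fs.open(new Path("/data/sample.txt"))) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) > 0) {
                    // process n bytes ...
                }
            }
        }
    }
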
37.

__________ typically compresses files to within 10% to 15% of the best available techniques.
(a) LZO
(b) Bzip2
(c) Gzip
(d) All of the mentioned
Topic: Compression (Hadoop I/O)

Answer»

The right option is (b) Bzip2

Explanation: bzip2 is a freely available, patent-free, high-quality data compressor.

38.

Gzip (short for GNU zip) generates compressed files that have a _________ extension.
(a) .gzip
(b) .gz
(c) .gzp
(d) .g
Topic: Compression (Hadoop I/O)

Answer»

The right option is (b) .gz

Best explanation: You can use the gunzip command to decompress files that were created by a number of compression utilities, including Gzip.

39.

Which of the following is based on the DEFLATE algorithm?
(a) LZO
(b) Bzip2
(c) Gzip
(d) All of the mentioned
Topic: Compression (Hadoop I/O)

Answer» The correct answer is (c) Gzip

The best I can explain: gzip is based on the DEFLATE algorithm, which is a combination of LZ77 and Huffman coding.

40.

Which of the following is the slowest compression technique?
(a) LZO
(b) Bzip2
(c) Gzip
(d) All of the mentioned
Topic: Compression (Hadoop I/O)

Answer» The right answer is (b) Bzip2

To explain I would say: Of all the available compression codecs in Hadoop, Bzip2 is by far the slowest.

41.

Point out the wrong statement.
(a) From a usability standpoint, LZO and Gzip are similar
(b) Bzip2 generates a better compression ratio than does Gzip, but it’s much slower
(c) Gzip is a compression utility that was adopted by the GNU project
(d) None of the mentioned
Topic: Compression (Hadoop I/O)

Answer»

The correct option is (a) From a usability standpoint, LZO and Gzip are similar

To explain I would say: From a usability standpoint, Bzip2 and Gzip are similar.

42.

Which of the following supports splittable compression?
(a) LZO
(b) Bzip2
(c) Gzip
(d) All of the mentioned
Topic: Compression (Hadoop I/O)

Answer»

The correct answer is (a) LZO

Easiest explanation: LZO enables the parallel processing of compressed text file splits by your MapReduce jobs.

43.

Which of the following compression is similar to Snappy compression?
(a) LZO
(b) Bzip2
(c) Gzip
(d) All of the mentioned
Topic: Compression (Hadoop I/O)

Answer» The right option is (a) LZO

The explanation: LZO is only really desirable if you need to compress text files.

44.

Point out the correct statement.
(a) Snappy is licensed under the GNU Public License (GPL)
(b) BgCIK needs to create an index when it compresses a file
(c) The Snappy codec is integrated into Hadoop Common, a set of common utilities that supports other Hadoop subprojects
(d) None of the mentioned
Topic: Compression (Hadoop I/O)

Answer»

The correct option is (c) The Snappy codec is integrated into Hadoop Common, a set of common utilities that supports other Hadoop subprojects

For explanation I would say: You can use Snappy as an add-on for more recent versions of Hadoop that do not yet provide Snappy codec support.
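
For example, a minimal sketch that turns on Snappy for intermediate map output (property names as used by the MapReduce v2 configuration; the native Snappy library must be available on the cluster, and the rest of the job wiring is omitted):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class SnappyMapOutputSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Compress intermediate map output with the Snappy codec from Hadoop Common.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.set("mapreduce.map.output.compress.codec",
                     "org.apache.hadoop.io.compress.SnappyCodec");

            Job job = Job.getInstance(conf, "snappy-map-output-sketch");
            // ... set mapper, reducer, input and output paths as usual ...
        }
    }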

45.

The _________ codec from Google provides modest compression ratios.
(a) Snapcheck
(b) Snappy
(c) FileCompress
(d) None of the mentioned
Topic: Compression (Hadoop I/O)

Answer» The correct answer is (b) Snappy

To explain: Snappy has fast compression and decompression speeds.

46.

The _________ takes just the value field, append(value), and the key is a LongWritable that contains the record number, count + 1.
(a) SetFile
(b) ArrayFile
(c) BloomMapFile
(d) None of the mentioned
Topic: Hadoop I/O

Answer»

The correct option is (b) ArrayFile

To explain: The SetFile, by contrast, takes just the key field, append(key), instead of append(key, value), and the value is always the NullWritable instance.

47.

The ______ file is populated with the key and a LongWritable that contains the starting byte position of the record.
(a) Array
(b) Index
(c) Immutable
(d) All of the mentioned
Topic: Hadoop I/O

Answer»

The right choice is (b) Index

Best explanation: The index doesn’t contain all the keys, just a fraction of them (every 128th key by default).

48.

The __________ is a directory that contains two SequenceFiles.
(a) ReduceFile
(b) MapperFile
(c) MapFile
(d) None of the mentioned
Topic: Hadoop I/O

Answer» The right option is (c) MapFile

For explanation I would say: The two SequenceFiles are the data file (“/data”) and the index file (“/index”).
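
A small sketch that writes a MapFile and leaves those two files behind (the /tmp/demo.map path and the key-value pairs are made up; the classic Writer constructor used here is deprecated in recent Hadoop releases but still available):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.MapFile;
    import org.apache.hadoop.io.Text;

    public class MapFileSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // A MapFile is a directory; this (deprecated) constructor takes its name as a String.
            MapFile.Writer writer =
                new MapFile.Writer(conf, fs, "/tmp/demo.map", Text.class, Text.class);
            try {
                // Keys must be appended in sorted order.
                writer.append(new Text("a"), new Text("apple"));
                writer.append(new Text("b"), new Text("banana"));
            } finally {
                writer.close();
            }
            // /tmp/demo.map now holds two SequenceFiles: the data file and the index file.
        }
    }
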
49.

Which of the following format is more compression-aggressive?
(a) Partition Compressed
(b) Record Compressed
(c) Block-Compressed
(d) Uncompressed
Topic: Hadoop I/O

Answer»

The correct answer is (c) Block-Compressed

Explanation: In the block-compressed format, many key-value pairs are buffered and compressed together in blocks, which yields better compression than compressing each record individually.
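
A minimal writer sketch using block compression (the /tmp/demo.seq path and key/value types are illustrative; this uses the classic createWriter overload, which is deprecated in newer releases but still present):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class BlockCompressedSeqFileSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/demo.seq");

            // BLOCK compression batches many key-value pairs per compressed block,
            // giving better ratios than RECORD compression.
            SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, path, IntWritable.class, Text.class,
                SequenceFile.CompressionType.BLOCK);
            try {
                for (int i = 0; i < 100; i++) {
                    writer.append(new IntWritable(i), new Text("value-" + i));
                }
            } finally {
                writer.close();
            }
        }
    }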

50.

Point out the wrong statement.
(a) The data file contains all the key, value records but key N + 1 must be greater than or equal to the key N
(b) Sequence file is a kind of hadoop file based data structure
(c) Map file type is splittable as it contains a sync point after several records
(d) None of the mentioned
Topic: Hadoop I/O

Answer»

The correct answer is (c) Map file type is splittable as it contains a sync point after several records

Easiest explanation: A map file is again a kind of Hadoop file-based data structure; it differs from a sequence file in that its keys must be written in sorted order.