InterviewSolution
This section offers curated multiple-choice questions, each with an answer and a brief explanation, to sharpen your knowledge and support exam preparation.
201. Point out the correct statement.
(a) Apache Avro is a framework that allows you to serialize data in a format that has a schema built in
(b) The serialized data is in a compact binary format that doesn’t require proxy objects or code generation
(c) Including schemas with the Avro messages allows any application to deserialize the data
(d) All of the mentioned

Answer» (d) All of the mentioned. Explanation: Instead of using generated proxy libraries and strong typing, Avro relies heavily on the schemas that are sent along with the serialized data.

202. Which of the following is a configuration management system?
(a) Alex
(b) Puppet
(c) Acem
(d) None of the mentioned

Answer» (b) Puppet. Explanation: Administrators may use configuration management systems such as Puppet and Chef to manage processes.

203. Avro schemas describe the format of the message and are defined using ______________
(a) JSON
(b) XML
(c) JS
(d) All of the mentioned

Answer» (a) JSON. Explanation: The schema, written in JSON, is typically saved in a file (conventionally with an .avsc extension).

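To make this concrete, here is a minimal sketch of parsing a JSON schema with the Avro Java API; the `User` schema, its fields, and the class name are illustrative assumptions rather than part of the questions above.

```java
import org.apache.avro.Schema;

public class ParseSchemaDemo {
    public static void main(String[] args) {
        // A hypothetical record schema, written inline as JSON.
        String json = "{"
            + "\"type\": \"record\", \"name\": \"User\", \"fields\": ["
            + "  {\"name\": \"id\",   \"type\": \"long\"},"
            + "  {\"name\": \"name\", \"type\": \"string\"}"
            + "]}";
        Schema schema = new Schema.Parser().parse(json);
        System.out.println(schema.getName());   // User
        System.out.println(schema.getFields()); // fields, in declaration order
    }
}
```
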
204. Point out the correct statement.
(a) RAID is turned off by default
(b) Hadoop is designed to be a highly redundant distributed system
(c) Hadoop has a networked configuration system
(d) None of the mentioned

Answer» (b) Hadoop is designed to be a highly redundant distributed system. Explanation: Redundancy is built in at the software level through block replication, so hardware redundancy such as RAID is unnecessary on data nodes.

205. ___________ mode allows you to suppress alerts for a host, service, role, or even the entire cluster.
(a) Safe
(b) Maintenance
(c) Secure
(d) All of the mentioned

Answer» (b) Maintenance. Explanation: Maintenance mode can be useful when you need to take actions in your cluster and do not want to see the alerts that will be generated due to those actions.

206. Which of the following is a primitive data type in Avro?
(a) null
(b) boolean
(c) float
(d) all of the mentioned

Answer» (d) all of the mentioned. Explanation: Avro's primitive types are null, boolean, int, long, float, double, bytes and string; primitive type names are also defined type names.

207. Point out the correct statement.
(a) Records use the type name “record” and support three attributes
(b) Enum are represented using JSON arrays
(c) Avro data is always serialized with its schema
(d) All of the mentioned

Answer» (a) Records use the type name “record” and support three attributes. Explanation: A record is encoded by encoding the values of its fields in the order that they are declared.

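The following hedged sketch shows this encoding in action with Avro's generic Java API: the container file embeds the schema in its header (so no generated proxy classes are needed to read it back), and each record is encoded field by field in declaration order. The schema, file name, and values are assumptions for illustration.

```java
import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class WriteAvroDemo {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"long\"},"
          + "{\"name\":\"name\",\"type\":\"string\"}]}");

        GenericRecord user = new GenericData.Record(schema);
        user.put("id", 1L);
        user.put("name", "alice");

        // The data file stores the schema in its header, so any reader can
        // deserialize the records without generated code.
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, new File("users.avro"));
            writer.append(user); // fields encoded in declaration order
        }
    }
}
```
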
208. Which of the following interfaces is implemented by Sqoop for recording?
(a) SqoopWrite
(b) SqoopRecord
(c) SqoopRead
(d) None of the mentioned

Answer» (b) SqoopRecord. Explanation: SqoopRecord is the interface implemented by the classes that Sqoop's orm.ClassWriter generates.

209. Records are terminated by a __________ character.
(a) RECORD_DELIMITER
(b) FIELD_DELIMITER
(c) FIELD_LIMITER
(d) None of the mentioned

Answer» (a) RECORD_DELIMITER. Explanation: The RecordParser class parses a record containing one or more fields.

210. _________ supports null values for all types.
(a) SmallObjectLoader
(b) FieldMapProcessor
(c) DelimiterSet
(d) JdbcWritableBridge

Answer» (d) JdbcWritableBridge. Explanation: The JdbcWritableBridge class contains a set of methods that read database columns from a ResultSet into Java types, and it supports null values for all of them.

211. Which of the following classes is used for general processing of errors?
(a) LargeObjectLoader
(b) ProcessingException
(c) DelimiterSet
(d) LobSerializer

Answer» (b) ProcessingException. Explanation: ProcessingException signals a general error that occurs during the processing of a SqoopRecord.

212. Point out the correct statement.
(a) ZooKeeper can achieve high throughput and high latency numbers
(b) The fault tolerant ordering means that sophisticated synchronization primitives can be implemented at the client
(c) The ZooKeeper implementation puts a premium on high performance, highly available, strictly ordered access
(d) All of the mentioned

Answer» (c) The ZooKeeper implementation puts a premium on high performance, highly available, strictly ordered access. Explanation: The performance aspects of ZooKeeper mean that it can be used in large, distributed systems.

213. Which of the following is a singleton instance class?
(a) LargeObjectLoader
(b) FieldMapProcessor
(c) DelimiterSet
(d) LobSerializer

Answer» (a) LargeObjectLoader. Explanation: Its lifetime is limited to that of the current TaskInputOutputContext.

214. Which of the following guarantees is provided by ZooKeeper?
(a) Interactivity
(b) Flexibility
(c) Scalability
(d) Reliability

Answer» (d) Reliability. Explanation: Once an update has been applied, it will persist from that time forward until a client overwrites it.

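A minimal sketch of that reliability guarantee using the ZooKeeper Java client; the connection string, znode path, and value are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkReliabilityDemo {
    public static void main(String[] args) throws Exception {
        // Point the connection string at your own ensemble.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {});

        // Once this update is applied, it persists until a client overwrites it.
        zk.create("/demo-config", "v1".getBytes(StandardCharsets.UTF_8),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        byte[] data = zk.getData("/demo-config", false, null);
        System.out.println(new String(data, StandardCharsets.UTF_8)); // v1
        zk.close();
    }
}
```
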
215. __________ is used as a remote procedure call (RPC) framework at Facebook.
(a) Oozie
(b) Mahout
(c) Thrift
(d) Impala

Answer» (c) Thrift. Explanation: Thrift was developed at Facebook; data structures and services defined in its interface definition files are compiled into RPC client and server code for many languages.

216. The ________ class provides the getValue() method to read the values from its instance.
(a) Get
(b) Result
(c) Put
(d) Value

Answer» (b) Result. Explanation: Pass a Get instance to the get() method of the HTable class; it returns a Result object, which holds the requested row.

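A hedged sketch of that read path using the current HBase client API (Connection/Table rather than the older HTable named above); the table, row, family, and qualifier names are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class GetValueDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("users"))) {
            Get get = new Get(Bytes.toBytes("row1"));
            Result result = table.get(get); // Result holds the requested row
            byte[] value = result.getValue(Bytes.toBytes("info"),
                                           Bytes.toBytes("name"));
            System.out.println(Bytes.toString(value));
        }
    }
}
```
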
217. A number of constants used in the client ZooKeeper API were renamed in order to reduce ________ collision.
(a) value
(b) namespace
(c) counter
(d) none of the mentioned

Answer» (b) namespace. Explanation: For example, ZOOKEEPER-18 removed KeeperStateChanged; use KeeperStateDisconnected instead.

218. ZooKeeper is especially fast in ___________ workloads.
(a) write
(b) read-dominant
(c) read-write
(d) none of the mentioned

Answer» (b) read-dominant. Explanation: ZooKeeper applications run on thousands of machines, and it performs best where reads are more common than writes, at ratios of around 10:1.

219. Point out the correct statement.
(a) Thrift is developed for scalable cross-language services development
(b) Thrift includes a complete stack for creating clients and servers
(c) The top part of the Thrift stack is generated code from the Thrift definition
(d) All of the mentioned

Answer» (d) All of the mentioned. Explanation: The Thrift compiler generates client and processor code from the service definition file.

220. ________ communicate with the client and handle data-related operations.
(a) Master Server
(b) Region Server
(c) Htable
(d) All of the mentioned

Answer» (b) Region Server. Explanation: Region Servers handle read and write requests for all the regions under them.

221. The Email & Apps team of ___________ uses ZooKeeper to coordinate sharding and responsibility changes in a distributed email client.
(a) Katta
(b) Helprace
(c) Rackspace
(d) None of the mentioned

Answer» (c) Rackspace. Explanation: ZooKeeper also provides distributed locking for connections to prevent a cluster from overwhelming servers.

222. Point out the correct statement.
(a) With TextInputFormat and KeyValueTextInputFormat, each mapper receives a variable number of lines of input
(b) With StreamXmlRecordReader, the page elements can be interpreted as records for processing by a mapper
(c) The number depends on the size of the split and the length of the lines
(d) All of the mentioned

Answer» (d) All of the mentioned. Explanation: Large XML documents that are composed of a series of “records” can be broken into those records using simple string or regular-expression matching to find the start and end tags of records.

223. ZooKeeper is used for configuration and leader election in the Cloud edition of ______________
(a) Solr
(b) Solur
(c) Solar101
(d) All of the mentioned

Answer» (a) Solr. Explanation: SolrCloud uses ZooKeeper for configuration management and leader election; within the Hadoop ecosystem, HBase uses it similarly.

224. _________ is the main configuration file of HBase.
(a) hbase.xml
(b) hbase-site.xml
(c) hbase-site-conf.xml
(d) none of the mentioned

Answer» (b) hbase-site.xml. Explanation: Open the conf directory under the HBase home folder (e.g. /usr/local/HBase) and set the data directory to an appropriate location in this file.

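For illustration, a minimal hbase-site.xml might look like the sketch below; the property values are assumptions and should point at your own directories.

```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <!-- Illustrative path for standalone mode; adjust to your setup. -->
    <value>file:///usr/local/HBase/hbasedata</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/usr/local/HBase/zookeeperdata</value>
  </property>
</configuration>
```
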
225. Which of the following projects is an interface definition language for Hadoop?
(a) Oozie
(b) Mahout
(c) Thrift
(d) Impala

Answer» (c) Thrift. Explanation: Thrift is an interface definition language and binary communication protocol that is used to define and create services for numerous languages.

226. The ______________ class defines a configuration parameter named LINES_PER_MAP that controls how the input file is split.
(a) NLineInputFormat
(b) InputLineFormat
(c) LineInputFormat
(d) None of the mentioned

Answer» (a) NLineInputFormat. Explanation: The parameter (mapreduce.input.lineinputformat.linespermap) can be set in the job configuration or through NLineInputFormat's setNumLinesPerSplit() method.

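A short sketch of configuring this in a MapReduce driver; the input path is an assumption.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

public class NLineDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        job.setInputFormatClass(NLineInputFormat.class);
        // Sets mapreduce.input.lineinputformat.linespermap (LINES_PER_MAP):
        // each mapper now receives exactly 10 lines of input.
        NLineInputFormat.setNumLinesPerSplit(job, 10);
        FileInputFormat.addInputPath(job, new Path("/input"));
    }
}
```
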
227. ___________ takes node and rack locality into account when deciding which blocks to place in the same split.
(a) CombineFileOutputFormat
(b) CombineFileInputFormat
(c) TextFileInputFormat
(d) None of the mentioned

Answer» (b) CombineFileInputFormat. Explanation: CombineFileInputFormat does not compromise the speed at which it can process the input in a typical MapReduce job.

228. The split size is normally the size of a ________ block, which is appropriate for most applications.
(a) Generic
(b) Task
(c) Library
(d) HDFS

Answer» (d) HDFS. Explanation: FileInputFormat splits only large files (here, “large” means larger than an HDFS block).

229. Point out the correct statement.
(a) The minimum split size is usually 1 byte, although some formats have a lower bound on the split size
(b) Applications may impose a minimum split size
(c) The maximum split size defaults to the maximum value that can be represented by a Java long type
(d) All of the mentioned

Answer» (a) The minimum split size is usually 1 byte, although some formats have a lower bound on the split size. Explanation: The maximum split size has an effect only when it is less than the block size, forcing splits to be smaller than a block.

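The split-size knobs from questions 227 to 229 can be set in a driver as sketched below; the 128 MB figure and the input format choice are assumptions for illustration.

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SplitSizeDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        // Raise the minimum above the HDFS block size to force larger splits,
        // or lower the maximum below it to force smaller ones.
        FileInputFormat.setMinInputSplitSize(job, 1L);
        FileInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);

        // For many small files, CombineTextInputFormat packs several files
        // into one split while respecting node and rack locality.
        job.setInputFormatClass(CombineTextInputFormat.class);
        CombineTextInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);
    }
}
```
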
230. Point out the wrong statement.
(a) Oozie v2 is a server-based Coordinator Engine specialized in running workflows based on time and data triggers
(b) Oozie v1 is a server-based Workflow Engine specialized in running workflow jobs with actions that execute Hadoop Map/Reduce and Pig jobs
(c) A Workflow application is a DAG that coordinates the following types of actions
(d) None of the mentioned

Answer» (d) None of the mentioned. Explanation: Cycles in workflows are not supported.

231. Oozie can make _________ callback notifications on action start events and workflow end events.
(a) TCP
(b) HTTP
(c) IP
(d) All of the mentioned

Answer» (b) HTTP. Explanation: In the case of an action start failure in a workflow job, depending on the type of failure, Oozie will attempt automatic retries, request a manual retry, or fail the workflow job.

232. Which of the following is one of the possible states for a workflow job?
(a) PREP
(b) START
(c) RESUME
(d) END

Answer» (a) PREP. Explanation: The possible states for a workflow job are PREP, RUNNING, SUSPENDED, SUCCEEDED, KILLED and FAILED.

233. Oozie Workflow jobs are Directed ________ graphs of actions.
(a) Acyclical
(b) Cyclical
(c) Elliptical
(d) All of the mentioned

Answer» (a) Acyclical. Explanation: Oozie is a framework that combines multiple Map/Reduce jobs into a logical unit of work.

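A minimal sketch of such a DAG as a workflow.xml; the app name, action name, and properties are illustrative assumptions.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="demo-wf">
  <start to="mr-step"/>
  <action name="mr-step">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
    </map-reduce>
    <!-- Transitions only point forward; cycles are not allowed. -->
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```
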
234. Which of the following has the core Eclipse PDE tools for HDT development?
(a) RVP
(b) RAP
(c) RBP
(d) RVP

Answer» (b) RAP. Explanation: The RCP/RAP developers package has the core Eclipse PDE tools.

235. HDT is used for listing running jobs on a __________ cluster.
(a) MR
(b) Hive
(c) Pig
(d) None of the mentioned

Answer» (a) MR. Explanation: HDT can be used for launching MapReduce programs on a Hadoop cluster.

236. Which of the following tools is intended to be more compatible with HDT?
(a) Git
(b) Juno
(c) Indigo
(d) None of the mentioned

Answer» (c) Indigo. Explanation: HDT's source is kept in a Git repository, which anyone is free to check out.

237. Point out the wrong statement.
(a) There is support for creating Hadoop projects in HDT
(b) HDT aims at bringing plugins in Eclipse to simplify development on the Hadoop platform
(c) HDT is based on the Eclipse plugin architecture and can possibly support other versions like 0.23, CDH4 etc. in future releases
(d) None of the mentioned

Answer» (d) None of the mentioned. Explanation: HDT aims to simplify the Hadoop platform for developers.

238. HDT has been tested on __________ and Juno, and can work on Kepler as well.
(a) Rainbow
(b) Indigo
(c) Indiavo
(d) Hadovo

Answer» (b) Indigo. Explanation: HDT aims at bringing plugins in Eclipse to simplify development on the Hadoop platform.

239. Which of the following platforms does Hadoop run on?
(a) Bare metal
(b) Debian
(c) Cross-platform
(d) Unix-like

Answer» (c) Cross-platform. Explanation: Hadoop supports cross-platform operation.

240. What was Hadoop written in?
(a) Java (software platform)
(b) Perl
(c) Java (programming language)
(d) Lua (programming language)

Answer» (c) Java (programming language). Explanation: The Hadoop framework itself is mostly written in the Java programming language, with some native code in C and command-line utilities written as shell scripts.

241. Which of the following genres does Hadoop fall under?
(a) Distributed file system
(b) JAX-RS
(c) Java Message Service
(d) Relational Database Management System

Answer» (a) Distributed file system. Explanation: The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications.

242. Point out the wrong statement.
(a) Hadoop works better with a small number of large files than a large number of small files
(b) CombineFileInputFormat is designed to work well with small files
(c) CombineFileInputFormat does not compromise the speed at which it can process the input in a typical MapReduce job
(d) None of the mentioned

Answer» (d) None of the mentioned. Explanation: All three statements hold (question 227 relies on statement (c) itself). If files are very small (“small” means significantly smaller than an HDFS block) and there are a lot of them, each map task will process very little input, and there will be many tasks (one per file), each imposing extra bookkeeping overhead; CombineFileInputFormat addresses this by packing many files into each split.

243. Point out the correct statement.
(a) Avro provides functionality similar to systems such as Thrift
(b) When Avro is used in RPC, the client and server exchange data in the connection handshake
(c) Apache Avro, Avro, Apache, and the Avro and Apache logos are trademarks of The Java Foundation
(d) None of the mentioned

Answer» (a) Avro provides functionality similar to systems such as Thrift. Explanation: Avro differs from those systems in fundamental aspects such as untagged data.

244. Avro supports ______ kinds of complex types.
(a) 3
(b) 4
(c) 6
(d) 7

Answer» (c) 6. Explanation: Avro supports six kinds of complex types: records, enums, arrays, maps, unions and fixed.

245. Avro schemas are defined with _____
(a) JSON
(b) XML
(c) JAVA
(d) All of the mentioned

Answer» (a) JSON. Explanation: Defining schemas in JSON facilitates implementation in languages that already have JSON libraries.

246. Point out the correct statement.
(a) Avro Fixed type should be defined in Hive as lists of tiny ints
(b) Avro Bytes type should be defined in Hive as lists of tiny ints
(c) Avro Enum type should be defined in Hive as strings
(d) All of the mentioned

Answer» (b) Avro Bytes type should be defined in Hive as lists of tiny ints. Explanation: The AvroSerde will convert these to Bytes during the saving process.

247. Point out the wrong statement.
(a) Java code is used to deserialize the contents of the file into objects
(b) Avro allows you to use complex data structures within Hadoop MapReduce jobs
(c) The m2e plugin automatically downloads the newly added JAR files and their dependencies
(d) None of the mentioned

Answer» (d) None of the mentioned. Explanation: A unit test is useful because you can make assertions to verify that the values of the deserialized object are the same as the original values.

248. Which HDFS command is used to check for various inconsistencies?
(a) fsk
(b) fsck
(c) fetchdt
(d) none of the mentioned

Answer» (b) fsck. Explanation: fsck is designed for reporting problems with various files, for example, missing blocks for a file or under-replicated blocks.

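A typical invocation, sketched below; the path and flags shown are common options rather than an exhaustive list.

```sh
# Check the whole filesystem, listing files, their blocks, and block locations
hdfs fsck / -files -blocks -locations
```
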
249. Which of the following is a common Hadoop maintenance issue?
(a) Lack of tools
(b) Lack of configuration management
(c) Lack of web interface
(d) None of the mentioned

Answer» (b) Lack of configuration management. Explanation: Without a centralized configuration management framework, you end up with a number of issues that can cascade just as your usage picks up.

250. Point out the wrong statement.
(a) To create an Avro-backed table, specify the serde as org.apache.hadoop.hive.serde2.avro.AvroSerDe
(b) Avro-backed tables can be created in Hive using AvroSerDe
(c) The AvroSerde cannot serialize any Hive table to Avro files
(d) None of the mentioned

Answer» (c) The AvroSerde cannot serialize any Hive table to Avro files. Explanation: On the contrary, the AvroSerde can serialize any Hive table to Avro files.

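As a hedged illustration of statement (a), the HiveQL below creates an Avro-backed table with that serde; the table name and inline schema are assumptions.

```sql
CREATE TABLE users_avro
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal'='{"type":"record","name":"User","fields":[{"name":"id","type":"long"},{"name":"name","type":"string"}]}');
```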