InterviewSolution
This section collects InterviewSolutions: curated multiple-choice questions to sharpen your knowledge and support exam preparation. Choose a topic below to get started.
| 101. |
How many types of modes are present in Hama? (a) 2 (b) 3 (c) 4 (d) 5 |
|
Answer» The correct choice is (b) 3. Explanation: Hama runs in three modes, local, pseudo-distributed, and distributed, mirroring Hadoop's deployment modes. |
|
| 102. |
SolrJ now has first-class support for the __________ API. (a) Compactions (b) Collections (c) Distribution (d) All of the mentioned |
|
Answer» The correct option is (b) Collections. Explanation: Solr is the popular, blazing-fast, open-source enterprise search platform built on Apache Lucene. |
|
| 103. |
New ____________ type enables indexing and searching of date ranges, particularly multi-valued ones. (a) RangeField (b) DateField (c) DateRangeField (d) All of the mentioned |
|
Answer» The correct answer is (c) DateRangeField |
|
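An aside on question 103: searching a multi-valued date-range field conceptually reduces to interval-overlap tests. A minimal plain-Python sketch of the idea (illustrative only, not Lucene's DateRangeField API):

```python
from datetime import date

def overlaps(a_start, a_end, b_start, b_end):
    """Two closed intervals overlap iff each starts before the other ends."""
    return a_start <= b_end and b_start <= a_end

# A document may hold several date ranges (a multi-valued field).
doc_ranges = [
    (date(2015, 1, 1), date(2015, 1, 31)),
    (date(2015, 6, 1), date(2015, 6, 30)),
]

def matches(doc_ranges, q_start, q_end):
    """Document matches if ANY of its ranges overlaps the query range."""
    return any(overlaps(s, e, q_start, q_end) for s, e in doc_ranges)

print(matches(doc_ranges, date(2015, 6, 15), date(2015, 7, 15)))  # True
print(matches(doc_ranges, date(2015, 3, 1), date(2015, 3, 31)))   # False
```

A real index accelerates these overlap tests with tree structures rather than a linear scan, but the matching semantics are the same.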
| 104. |
PostingsFormat now uses a __________ API when writing postings, just like doc values. (a) push (b) pull (c) read (d) all of the mentioned |
|
Answer» The correct answer is (b) pull |
|
| 105. |
Heap usage during IndexWriter merging is also much lower with the new _________ (a) LucCodec (b) Lucene50Codec (c) Lucene20Cod (d) All of the mentioned |
|
Answer» The correct answer is (b) Lucene50Codec. Explanation: doc values and norms for the segments being merged are no longer fully loaded into heap for all fields. |
|
| 106. |
Point out the wrong statement. (a) ConcurScheduler detects whether the index is on SSD or not (b) Memory index supports payloads (c) Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate limit IO writes for each merge depending on incoming merge rate (d) The default codec has an option to control BEST_SPEED or BEST_COMPRESSION for stored fields |
|
Answer» The correct choice is (a). Explanation: it is ConcurrentMergeScheduler, not "ConcurScheduler", that detects whether the index is on an SSD and now does a better job defaulting its settings accordingly. |
|
| 107. |
During merging, __________ now always checks the incoming segments for corruption before merging. (a) LocalWriter (b) IndexWriter (c) ReadWriter (d) All of the mentioned |
|
Answer» The correct answer is (b) IndexWriter |
|
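A note on question 107: checking incoming segments for corruption before a merge is, at heart, a checksum verification pass. A plain-Python sketch of the idea using CRC32 (illustrative only, not Lucene's actual file format or API):

```python
import zlib

def write_segment(payload: bytes):
    """Store data together with its CRC32 checksum, as index files do."""
    return payload, zlib.crc32(payload)

def verify(segment):
    """Recompute the checksum and compare with the stored one."""
    payload, stored_crc = segment
    return zlib.crc32(payload) == stored_crc

seg = write_segment(b"doc1 doc2 doc3")
print(verify(seg))                       # True: intact segment passes

corrupted = (b"doc1 docX doc3", seg[1])  # bit-rot: payload changed on disk
print(verify(corrupted))                 # False: merge would refuse it
```

Verifying before the merge matters because merging silently bakes its inputs into a new segment; catching corruption first stops it from spreading.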
| 108. |
The Lucene _________ is pleased to announce the availability of Apache Lucene 5.0.0 and Apache Solr 5.0.0. (a) PMC (b) RPC (c) CPM (d) All of the mentioned |
|
Answer» The correct option is (a) PMC. Explanation: PMC stands for Project Management Committee, the Apache body that oversees the Lucene project and announces its releases. |
|
| 109. |
Point out the correct statement. (a) Every Lucene segment now stores a unique id per-segment and per-commit to aid in accurate replication of index files (b) The default norms format now uses sparse encoding when appropriate (c) Tokenizers and Analyzers no longer require Reader on init (d) All of the mentioned |
|
Answer» The correct answer is (d) All of the mentioned |
|
| 110. |
All file access uses Java’s __________ APIs, which give Lucene stronger index safety. (a) NIO.2 (b) NIO.3 (c) NIO.4 (d) NIO.5 |
|
Answer» The correct option is (a) NIO.2 |
|
| 111. |
Lucene provides scalable, high-performance indexing of over ______ per hour on modern hardware. (a) 1 TB (b) 150GB (c) 10 GB (d) None of the mentioned |
|
Answer» The correct choice is (b) 150GB |
|
| 112. |
___________ is a technology suitable for nearly any application that requires full-text search, especially cross-platform. (a) Lucene (b) Oozie (c) Lucy (d) All of the mentioned |
|
Answer» The correct choice is (a) Lucene |
|
| 113. |
Point out the wrong statement. (a) PyLucene is a Lucene port (b) PyLucene embeds a Java VM with Lucene into a Python process (c) The PyLucene Python extension, a Python module called lucene, is machine-generated by JCC (d) PyLucene is built with JCC |
|
Answer» The correct option is (a). Explanation: PyLucene is not a port of Lucene; it embeds a Java VM running Lucene inside a Python process. |
|
| 114. |
_______ is a Python port of the Core project. (a) Solr (b) Lucene Core (c) Lucy (d) PyLucene |
|
Answer» The correct answer is (d) PyLucene |
|
| 115. |
____________ is a subproject with the aim of collecting and distributing free materials. (a) OSR (b) OPR (c) ORP (d) ORS |
|
Answer» The correct answer is (c) ORP. Explanation: ORP, the Open Relevance Project, collects and distributes free materials for relevance testing and performance measurement. |
|
| 116. |
Point out the correct statement. (a) Building PyLucene requires GNU Make, a recent version of Ant capable of building Java Lucene, and a C++ compiler (b) PyLucene is supported on Mac OS X, Linux, Solaris and Windows (c) Use of setuptools is recommended for Lucene (d) All of the mentioned |
|
Answer» The correct option is (d) All of the mentioned |
|
| 117. |
___________ provides Java-based indexing and search technology. (a) Solr (b) Lucene Core (c) Lucy (d) All of the mentioned |
|
Answer» The correct answer is (b) Lucene Core |
|
| 118. |
____________ sink can be a text file, the console display, a simple HDFS path, or a null bucket where the data is simply deleted. (a) Collector Tier Event (b) Agent Tier Event (c) Basic (d) None of the mentioned |
|
Answer» The correct option is (c) Basic |
|
| 119. |
___________ is a high-performance search server built using Lucene Core. (a) Solr (b) Lucene Core (c) Lucy (d) PyLucene |
|
Answer» The correct answer is (a) Solr |
|
| 120. |
___________ is where you would land a flow (or possibly multiple flows joined together) into an HDFS-formatted file system. (a) Collector Tier Event (b) Agent Tier Event (c) Basic (d) All of the mentioned |
|
Answer» The correct choice is (a) Collector Tier Event |
|
| 121. |
Point out the wrong statement. (a) Version 1.4.0 is the fourth Flume release as an Apache top-level project (b) Apache Flume 1.5.2 is a security and maintenance release that disables SSLv3 on all components in Flume that support SSL/TLS (c) Flume is backwards-compatible with previous versions of the Flume 1.x codeline (d) None of the mentioned |
|
Answer» The correct answer is (d) None of the mentioned |
|
| 122. |
A number of ____________ source adapters give you the granular control to grab a specific file. (a) multimedia file (b) text file (c) image file (d) none of the mentioned |
|
Answer» The correct option is (b) text file |
|
| 123. |
____________ is used when you want the sink to be the input source for another operation. (a) Collector Tier Event (b) Agent Tier Event (c) Basic (d) All of the mentioned |
|
Answer» The correct choice is (b) Agent Tier Event. Explanation: clients send events to agents, which host the Flume components; an agent tier event sink passes its output on as the input of another operation. |
|
| 124. |
A ____________ is an operation on the stream that can transform the stream. (a) Decorator (b) Source (c) Sinks (d) All of the mentioned |
|
Answer» The correct answer is (a) Decorator. Explanation: in Flume, a decorator sits on the stream and transforms events as they flow from a source to a sink. |
|
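To make question 124 concrete: a decorator-style transform over an event stream can be sketched in a few lines of plain Python. The function names are illustrative, not Flume's API:

```python
def source():
    """A source produces a stream of raw events."""
    for line in ["error: disk full", "info: ok", "error: timeout"]:
        yield line

def uppercase_decorator(stream):
    """A decorator sits on the stream and transforms each event in flight."""
    for event in stream:
        yield event.upper()

def sink(stream):
    """A sink consumes the (possibly transformed) stream."""
    return list(stream)

# Source -> decorator -> sink, composed exactly like a Flume flow.
result = sink(uppercase_decorator(source()))
print(result)
```

Because each stage is a generator, events flow through one at a time rather than being buffered, which is the same streaming shape a real pipeline has.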
| 125. |
___________ was created to allow you to flow data from a source into your Hadoop environment. (a) Impala (b) Oozie (c) Flume (d) All of the mentioned |
|
Answer» The correct choice is (c) Flume. Explanation: in Flume, the entities you work with are called sources, decorators, and sinks. |
|
| 126. |
Point out the correct statement. (a) Flume is a distributed, reliable, and available service (b) Version 1.5.2 is the eighth Flume release as an Apache top-level project (c) Flume 1.5.2 is production-ready software for integration with Hadoop (d) All of the mentioned |
|
Answer» The correct choice is (a) Flume is a distributed, reliable, and available service |
|
| 127. |
Spark is engineered from the bottom-up for performance, running ___________ faster than Hadoop by exploiting in-memory computing and other optimizations. (a) 100x (b) 150x (c) 200x (d) None of the mentioned |
|
Answer» The correct choice is (a) 100x. Explanation: Spark is fast on disk too; it currently holds the world record in large-scale on-disk sorting. |
|
| 128. |
Apache Flume 1.3.0 is the fourth release under the auspices of Apache of the so-called ________ codeline. (a) NG (b) ND (c) NF (d) NR |
|
Answer» The correct choice is (a) NG. Explanation: Flume 1.3.0 has been put through many stress and regression tests, is stable, production-ready software, and is backwards-compatible with Flume 1.2.0. |
|
| 129. |
Spark includes a collection of over ________ operators for transforming data and familiar data frame APIs for manipulating semi-structured data. (a) 50 (b) 60 (c) 70 (d) 80 |
|
Answer» The correct option is (d) 80 |
|
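To give a flavor of question 129: a handful of Spark's best-known operators (map, filter, reduce) can be mimicked with ordinary Python over a local list. This is plain Python, not Spark; the comments show the analogous RDD calls:

```python
from functools import reduce

data = [1, 2, 3, 4, 5]

mapped = [x * 2 for x in data]                # rdd.map(lambda x: x * 2)
filtered = [x for x in mapped if x > 4]       # .filter(lambda x: x > 4)
total = reduce(lambda a, b: a + b, filtered)  # .reduce(lambda a, b: a + b)

print(total)  # 6 + 8 + 10 = 24
```

The key difference in real Spark is that each step runs partitioned across a cluster and transformations are evaluated lazily, but the operator vocabulary is the same.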
| 130. |
Spark is packaged with higher-level libraries, including support for _________ queries. (a) SQL (b) C (c) C++ (d) None of the mentioned |
|
Answer» The correct answer is (a) SQL |
|
| 131. |
Which of the following languages is not supported by Spark? (a) Java (b) Pascal (c) Scala (d) Python |
|
Answer» The correct option is (b) Pascal |
|
| 132. |
Point out the wrong statement. (a) Spark is intended to replace the Hadoop stack (b) Spark was designed to read and write data from and to HDFS, as well as other storage systems (c) Hadoop users who have already deployed or are planning to deploy Hadoop YARN can simply run Spark on YARN (d) None of the mentioned |
|
Answer» The correct choice is (a). Explanation: Spark is designed to complement the Hadoop stack, not replace it. |
|
| 133. |
Which of the following can be used to launch Spark jobs inside MapReduce? (a) SIM (b) SIMR (c) SIR (d) RIS |
|
Answer» The correct answer is (b) SIMR (Spark In MapReduce). Explanation: with SIMR, users can start experimenting with Spark and use its shell within a couple of minutes of downloading it. |
|
| 134. |
Point out the correct statement. (a) Spark enables Apache Hive users to run their unmodified queries much faster (b) Spark interoperates only with Hadoop (c) Spark is a popular data warehouse solution running on top of Hadoop (d) None of the mentioned |
|
Answer» The correct option is (a) Spark enables Apache Hive users to run their unmodified queries much faster |
|
| 135. |
Spark runs on top of ___________, a cluster manager system which provides efficient resource isolation across distributed applications. (a) Mesjs (b) Mesos (c) Mesus (d) All of the mentioned |
|
Answer» The correct option is (b) Mesos |
|
| 136. |
GraphX provides an API for expressing graph computation that can model the __________ abstraction. (a) GaAdt (b) Spark Core (c) Pregel (d) None of the mentioned |
|
Answer» The correct option is (c) Pregel |
|
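To illustrate question 136: the Pregel abstraction is vertex-centric; in each superstep every vertex merges its incoming messages, updates its value, and messages its neighbors. A tiny plain-Python sketch (not the GraphX API) that propagates the maximum vertex value:

```python
edges = {1: [2], 2: [1, 3], 3: [2]}   # undirected chain 1-2-3
values = {1: 5, 2: 1, 3: 9}           # initial vertex values

for superstep in range(3):
    # Message phase: each vertex sends its current value to its neighbors.
    messages = {v: [] for v in values}
    for src, nbrs in edges.items():
        for dst in nbrs:
            messages[dst].append(values[src])
    # Compute phase: each vertex keeps the max of its value and its inbox.
    values = {v: max([values[v]] + msgs) for v, msgs in messages.items()}

print(values)  # every vertex converges to the global max, 9
```

Three supersteps suffice here because the graph's diameter is 2; in general the value spreads one hop per superstep, which is exactly the Pregel execution model.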
| 137. |
Users can easily run Spark on top of Amazon’s __________ (a) Infosphere (b) EC2 (c) EMR (d) None of the mentioned |
|
Answer» The correct answer is (b) EC2 |
|
| 138. |
________ is a distributed graph processing framework on top of Spark. (a) MLlib (b) Spark Streaming (c) GraphX (d) All of the mentioned |
|
Answer» The correct choice is (c) GraphX |
|
| 139. |
____________ is a distributed machine learning framework on top of Spark. (a) MLlib (b) Spark Streaming (c) GraphX (d) RDDs |
|
Answer» The correct answer is (a) MLlib. Explanation: MLlib implements many common machine learning and statistical algorithms to simplify large-scale machine learning pipelines. |
|
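As a flavor of question 139: one of the "common machine learning algorithms" in an MLlib-style library is linear regression trained by gradient descent. A toy single-feature version in plain Python (illustrative only; MLlib runs this kind of computation distributed across a cluster):

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # true relationship: y = 2x

w = 0.0      # model weight, learned below
lr = 0.02    # learning rate

for _ in range(2000):
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 3))  # converges close to 2.0
```

The per-example gradient terms are independent, which is why this maps naturally onto a distributed framework: each partition sums its own terms and only the small partial sums are combined.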
| 140. |
Point out the wrong statement. (a) For distributed storage, Spark can interface with a wide variety of systems, including Hadoop Distributed File System (HDFS) (b) Spark also supports a pseudo-distributed mode, usually used only for development or testing purposes (c) Spark has over 465 contributors in 2014 (d) All of the mentioned |
|
Answer» The correct answer is (d) All of the mentioned |
|
| 141. |
______________ leverages Spark Core's fast scheduling capability to perform streaming analytics. (a) MLlib (b) Spark Streaming (c) GraphX (d) RDDs |
|
Answer» The correct option is (b) Spark Streaming |
|
| 142. |
Spark SQL provides a domain-specific language to manipulate ___________ in Scala, Java, or Python. (a) Spark Streaming (b) Spark SQL (c) RDDs (d) All of the mentioned |
|
Answer» The correct option is (c) RDDs |
|
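To illustrate the idea behind question 142: a domain-specific language for manipulating collections chains operations like where and select. The sketch below is plain Python with illustrative names, not Spark SQL's actual API:

```python
class Dataset:
    """A tiny, chainable query DSL over a list of row dictionaries."""

    def __init__(self, rows):
        self.rows = rows

    def where(self, pred):
        """Keep only rows matching the predicate."""
        return Dataset([r for r in self.rows if pred(r)])

    def select(self, *cols):
        """Project each row down to the named columns."""
        return Dataset([{c: r[c] for c in cols} for r in self.rows])

    def collect(self):
        """Materialize the result, like Spark's collect()."""
        return self.rows

people = Dataset([
    {"name": "ada", "age": 36},
    {"name": "bob", "age": 17},
])

adults = people.where(lambda r: r["age"] >= 18).select("name").collect()
print(adults)  # [{'name': 'ada'}]
```

Each method returns a new Dataset, which is what makes the fluent chaining work; a real engine would additionally defer execution and optimize the whole chain before running it.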
| 143. |
____________ is a component on top of Spark Core. (a) Spark Streaming (b) Spark SQL (c) RDDs (d) All of the mentioned |
|
Answer» The correct option is (b) Spark SQL |
|
| 144. |
Spark was initially started by ____________ at UC Berkeley AMPLab in 2009. (a) Mahek Zaharia (b) Matei Zaharia (c) Doug Cutting (d) Stonebraker |
|
Answer» The correct choice is (b) Matei Zaharia. Explanation: Apache Spark is an open-source cluster computing framework originally developed in the AMPLab at UC Berkeley. |
|
| 145. |
Point out the correct statement. (a) RSS abstraction provides distributed task dispatching, scheduling, and basic I/O functionalities (b) For cluster manager, Spark supports standalone and Hadoop YARN (c) Hive SQL is a component on top of Spark Core (d) None of the mentioned |
|
Answer» The correct choice is (b) For cluster manager, Spark supports standalone and Hadoop YARN |
|