InterviewSolution
This section collects InterviewSolutions: curated multiple-choice questions to sharpen your knowledge and support exam preparation. Choose a topic below to get started.
| 101. |
How many types of modes are present in Hama? (a) 2 (b) 3 (c) 4 (d) 5 |
|
Answer» The correct choice is (b) 3. Explanation: Hama runs in three modes, local, pseudo-distributed, and distributed, mirroring Hadoop's deployment modes. |
|
| 102. |
SolrJ now has first-class support for the __________ API. (a) Compactions (b) Collections (c) Distribution (d) All of the mentioned |
|
Answer» The correct option is (b) Collections. Explanation: Solr is the popular, blazing-fast, open-source enterprise search platform built on Apache Lucene. |
|
| 103. |
New ____________ type enables indexing and searching of date ranges, particularly multi-valued ones. (a) RangeField (b) DateField (c) DateRangeField (d) All of the mentioned |
|
Answer» The correct answer is (c) DateRangeField |
|
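An aside on question 103: searching a multi-valued date-range field conceptually reduces to interval-overlap tests. A minimal plain-Python sketch of the idea (illustrative only, not Lucene's DateRangeField API):

```python
from datetime import date

def overlaps(a_start, a_end, b_start, b_end):
    """Two closed intervals overlap iff each starts before the other ends."""
    return a_start <= b_end and b_start <= a_end

# A document may hold several date ranges (a multi-valued field).
doc_ranges = [
    (date(2015, 1, 1), date(2015, 1, 31)),
    (date(2015, 6, 1), date(2015, 6, 30)),
]

def matches(doc_ranges, q_start, q_end):
    """Document matches if ANY of its ranges overlaps the query range."""
    return any(overlaps(s, e, q_start, q_end) for s, e in doc_ranges)

print(matches(doc_ranges, date(2015, 6, 15), date(2015, 7, 15)))  # True
print(matches(doc_ranges, date(2015, 3, 1), date(2015, 3, 31)))   # False
```

A real index accelerates these overlap tests with tree structures rather than a linear scan, but the matching semantics are the same.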
| 104. |
PostingsFormat now uses a __________ API when writing postings, just like doc values. (a) push (b) pull (c) read (d) all of the mentioned |
|
Answer» The correct answer is (b) pull |
|
| 105. |
Heap usage during IndexWriter merging is also much lower with the new _________ (a) LucCodec (b) Lucene50Codec (c) Lucene20Cod (d) All of the mentioned |
|
Answer» The correct answer is (b) Lucene50Codec. Explanation: doc values and norms for the segments being merged are no longer fully loaded into heap for all fields. |
|
| 106. |
Point out the wrong statement. (a) ConcurScheduler detects whether the index is on SSD or not (b) Memory index supports payloads (c) Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate limit IO writes for each merge depending on incoming merge rate (d) The default codec has an option to control BEST_SPEED or BEST_COMPRESSION for stored fields |
|
Answer» The correct choice is (a). Explanation: it is ConcurrentMergeScheduler, not "ConcurScheduler", that detects whether the index is on an SSD and now does a better job defaulting its settings accordingly. |
|
| 107. |
During merging, __________ now always checks the incoming segments for corruption before merging. (a) LocalWriter (b) IndexWriter (c) ReadWriter (d) All of the mentioned |
|
Answer» The correct answer is (b) IndexWriter |
|
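A note on question 107: checking incoming segments for corruption before a merge is, at heart, a checksum verification pass. A plain-Python sketch of the idea using CRC32 (illustrative only, not Lucene's actual file format or API):

```python
import zlib

def write_segment(payload: bytes):
    """Store data together with its CRC32 checksum, as index files do."""
    return payload, zlib.crc32(payload)

def verify(segment):
    """Recompute the checksum and compare with the stored one."""
    payload, stored_crc = segment
    return zlib.crc32(payload) == stored_crc

seg = write_segment(b"doc1 doc2 doc3")
print(verify(seg))                       # True: intact segment passes

corrupted = (b"doc1 docX doc3", seg[1])  # bit-rot: payload changed on disk
print(verify(corrupted))                 # False: merge would refuse it
```

Verifying before the merge matters because merging silently bakes its inputs into a new segment; catching corruption first stops it from spreading.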
| 108. |
The Lucene _________ is pleased to announce the availability of Apache Lucene 5.0.0 and Apache Solr 5.0.0. (a) PMC (b) RPC (c) CPM (d) All of the mentioned |
|
Answer» The correct option is (a) PMC. Explanation: PMC stands for Project Management Committee, the Apache body that oversees the Lucene project and announces its releases. |
|
| 109. |
Point out the correct statement. (a) Every Lucene segment now stores a unique id per-segment and per-commit to aid in accurate replication of index files (b) The default norms format now uses sparse encoding when appropriate (c) Tokenizers and Analyzers no longer require Reader on init (d) All of the mentioned |
|
Answer» The correct answer is (d) All of the mentioned |
|
| 110. |
All file access uses Java’s __________ APIs, which give Lucene stronger index safety. (a) NIO.2 (b) NIO.3 (c) NIO.4 (d) NIO.5 |
|
Answer» The correct option is (a) NIO.2 |
|
| 111. |
Lucene provides scalable, high-performance indexing of over ______ per hour on modern hardware. (a) 1 TB (b) 150GB (c) 10 GB (d) None of the mentioned |
|
Answer» The correct choice is (b) 150GB |
|
| 112. |
___________ is a technology suitable for nearly any application that requires full-text search, especially cross-platform. (a) Lucene (b) Oozie (c) Lucy (d) All of the mentioned |
|
Answer» The correct choice is (a) Lucene |
|
| 113. |
Point out the wrong statement. (a) PyLucene is a Lucene port (b) PyLucene embeds a Java VM with Lucene into a Python process (c) The PyLucene Python extension, a Python module called lucene, is machine-generated by JCC (d) PyLucene is built with JCC |
|
Answer» The correct option is (a). Explanation: PyLucene is not a port of Lucene; it embeds a Java VM running Lucene inside a Python process. |
|
| 114. |
_______ is a Python port of the Core project. (a) Solr (b) Lucene Core (c) Lucy (d) PyLucene |
|
Answer» The correct answer is (d) PyLucene |
|
| 115. |
____________ is a subproject with the aim of collecting and distributing free materials. (a) OSR (b) OPR (c) ORP (d) ORS |
|
Answer» The correct answer is (c) ORP. Explanation: ORP, the Open Relevance Project, collects and distributes free materials for relevance testing and performance measurement. |
|
| 116. |
Point out the correct statement. (a) Building PyLucene requires GNU Make, a recent version of Ant capable of building Java Lucene, and a C++ compiler (b) PyLucene is supported on Mac OS X, Linux, Solaris and Windows (c) Use of setuptools is recommended for Lucene (d) All of the mentioned |
|
Answer» The correct option is (d) All of the mentioned |
|
| 117. |
___________ provides Java-based indexing and search technology. (a) Solr (b) Lucene Core (c) Lucy (d) All of the mentioned |
|
Answer» The correct answer is (b) Lucene Core |
|
| 118. |
____________ sink can be a text file, the console display, a simple HDFS path, or a null bucket where the data is simply deleted. (a) Collector Tier Event (b) Agent Tier Event (c) Basic (d) None of the mentioned |
|
Answer» The correct option is (c) Basic |
|
| 119. |
___________ is a high-performance search server built using Lucene Core. (a) Solr (b) Lucene Core (c) Lucy (d) PyLucene |
|
Answer» The correct answer is (a) Solr |
|
| 120. |
___________ is where you would land a flow (or possibly multiple flows joined together) into an HDFS-formatted file system. (a) Collector Tier Event (b) Agent Tier Event (c) Basic (d) All of the mentioned |
|
Answer» The correct choice is (a) Collector Tier Event |
|
| 121. |
Point out the wrong statement. (a) Version 1.4.0 is the fourth Flume release as an Apache top-level project (b) Apache Flume 1.5.2 is a security and maintenance release that disables SSLv3 on all components in Flume that support SSL/TLS (c) Flume is backwards-compatible with previous versions of the Flume 1.x codeline (d) None of the mentioned |
|
Answer» The correct answer is (d) None of the mentioned |
|
| 122. |
A number of ____________ source adapters give you the granular control to grab a specific file. (a) multimedia file (b) text file (c) image file (d) none of the mentioned |
|
Answer» The correct option is (b) text file |
|
| 123. |
____________ is used when you want the sink to be the input source for another operation. (a) Collector Tier Event (b) Agent Tier Event (c) Basic (d) All of the mentioned |
|
Answer» The correct choice is (b) Agent Tier Event. Explanation: clients send events to agents, which host the Flume components; an agent tier event sink passes its output on as the input of another operation. |
|
| 124. |
A ____________ is an operation on the stream that can transform the stream. (a) Decorator (b) Source (c) Sinks (d) All of the mentioned |
|
Answer» The correct answer is (a) Decorator. Explanation: in Flume, a decorator sits on the stream and transforms events as they flow from a source to a sink. |
|
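To make question 124 concrete: a decorator-style transform over an event stream can be sketched in a few lines of plain Python. The function names are illustrative, not Flume's API:

```python
def source():
    """A source produces a stream of raw events."""
    for line in ["error: disk full", "info: ok", "error: timeout"]:
        yield line

def uppercase_decorator(stream):
    """A decorator sits on the stream and transforms each event in flight."""
    for event in stream:
        yield event.upper()

def sink(stream):
    """A sink consumes the (possibly transformed) stream."""
    return list(stream)

# Source -> decorator -> sink, composed exactly like a Flume flow.
result = sink(uppercase_decorator(source()))
print(result)
```

Because each stage is a generator, events flow through one at a time rather than being buffered, which is the same streaming shape a real pipeline has.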
| 125. |
___________ was created to allow you to flow data from a source into your Hadoop environment. (a) Impala (b) Oozie (c) Flume (d) All of the mentioned |
|
Answer» The correct choice is (c) Flume. Explanation: in Flume, the entities you work with are called sources, decorators, and sinks. |
|
| 126. |
Point out the correct statement. (a) Flume is a distributed, reliable, and available service (b) Version 1.5.2 is the eighth Flume release as an Apache top-level project (c) Flume 1.5.2 is production-ready software for integration with Hadoop (d) All of the mentioned |
|
Answer» The correct choice is (a) Flume is a distributed, reliable, and available service |
|
| 127. |
Spark is engineered from the bottom-up for performance, running ___________ faster than Hadoop by exploiting in-memory computing and other optimizations. (a) 100x (b) 150x (c) 200x (d) None of the mentioned |
|
Answer» The correct choice is (a) 100x. Explanation: Spark is fast on disk too; it currently holds the world record in large-scale on-disk sorting. |
|
| 128. |
Apache Flume 1.3.0 is the fourth release under the auspices of Apache of the so-called ________ codeline. (a) NG (b) ND (c) NF (d) NR |
|
Answer» The correct choice is (a) NG. Explanation: Flume 1.3.0 has been put through many stress and regression tests, is stable, production-ready software, and is backwards-compatible with Flume 1.2.0. |
|
| 129. |
Spark includes a collection of over ________ operators for transforming data and familiar data frame APIs for manipulating semi-structured data. (a) 50 (b) 60 (c) 70 (d) 80 |
|
Answer» The correct option is (d) 80 |
|
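To give a flavor of question 129: a handful of Spark's best-known operators (map, filter, reduce) can be mimicked with ordinary Python over a local list. This is plain Python, not Spark; the comments show the analogous RDD calls:

```python
from functools import reduce

data = [1, 2, 3, 4, 5]

mapped = [x * 2 for x in data]                # rdd.map(lambda x: x * 2)
filtered = [x for x in mapped if x > 4]       # .filter(lambda x: x > 4)
total = reduce(lambda a, b: a + b, filtered)  # .reduce(lambda a, b: a + b)

print(total)  # 6 + 8 + 10 = 24
```

The key difference in real Spark is that each step runs partitioned across a cluster and transformations are evaluated lazily, but the operator vocabulary is the same.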
| 130. |
Spark is packaged with higher-level libraries, including support for _________ queries. (a) SQL (b) C (c) C++ (d) None of the mentioned |
|
Answer» The correct answer is (a) SQL |
|
| 131. |
Which of the following languages is not supported by Spark? (a) Java (b) Pascal (c) Scala (d) Python |
|
Answer» The correct option is (b) Pascal |
|
| 132. |
Point out the wrong statement. (a) Spark is intended to replace the Hadoop stack (b) Spark was designed to read and write data from and to HDFS, as well as other storage systems (c) Hadoop users who have already deployed or are planning to deploy Hadoop YARN can simply run Spark on YARN (d) None of the mentioned |
|
Answer» The correct choice is (a). Explanation: Spark is designed to complement the Hadoop stack, not replace it. |
|
| 133. |
Which of the following can be used to launch Spark jobs inside MapReduce? (a) SIM (b) SIMR (c) SIR (d) RIS |
|
Answer» The correct answer is (b) SIMR (Spark In MapReduce). Explanation: with SIMR, users can start experimenting with Spark and use its shell within a couple of minutes of downloading it. |
|
| 134. |
Point out the correct statement. (a) Spark enables Apache Hive users to run their unmodified queries much faster (b) Spark interoperates only with Hadoop (c) Spark is a popular data warehouse solution running on top of Hadoop (d) None of the mentioned |
|
Answer» The correct option is (a) Spark enables Apache Hive users to run their unmodified queries much faster |
|
| 135. |
Spark runs on top of ___________, a cluster manager system which provides efficient resource isolation across distributed applications. (a) Mesjs (b) Mesos (c) Mesus (d) All of the mentioned |
|
Answer» The correct option is (b) Mesos |
|
| 136. |
GraphX provides an API for expressing graph computation that can model the __________ abstraction. (a) GaAdt (b) Spark Core (c) Pregel (d) None of the mentioned |
|
Answer» The correct option is (c) Pregel |
|
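To illustrate question 136: the Pregel abstraction is vertex-centric; in each superstep every vertex merges its incoming messages, updates its value, and messages its neighbors. A tiny plain-Python sketch (not the GraphX API) that propagates the maximum vertex value:

```python
edges = {1: [2], 2: [1, 3], 3: [2]}   # undirected chain 1-2-3
values = {1: 5, 2: 1, 3: 9}           # initial vertex values

for superstep in range(3):
    # Message phase: each vertex sends its current value to its neighbors.
    messages = {v: [] for v in values}
    for src, nbrs in edges.items():
        for dst in nbrs:
            messages[dst].append(values[src])
    # Compute phase: each vertex keeps the max of its value and its inbox.
    values = {v: max([values[v]] + msgs) for v, msgs in messages.items()}

print(values)  # every vertex converges to the global max, 9
```

Three supersteps suffice here because the graph's diameter is 2; in general the value spreads one hop per superstep, which is exactly the Pregel execution model.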
| 137. |
Users can easily run Spark on top of Amazon’s __________ (a) Infosphere (b) EC2 (c) EMR (d) None of the mentioned |
|
Answer» The correct answer is (b) EC2 |
|
| 138. |
________ is a distributed graph processing framework on top of Spark. (a) MLlib (b) Spark Streaming (c) GraphX (d) All of the mentioned |
|
Answer» The correct choice is (c) GraphX |
|
| 139. |
____________ is a distributed machine learning framework on top of Spark. (a) MLlib (b) Spark Streaming (c) GraphX (d) RDDs |
|
Answer» The correct answer is (a) MLlib. Explanation: MLlib implements many common machine learning and statistical algorithms to simplify large-scale machine learning pipelines. |
|
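As a flavor of question 139: one of the "common machine learning algorithms" in an MLlib-style library is linear regression trained by gradient descent. A toy single-feature version in plain Python (illustrative only; MLlib runs this kind of computation distributed across a cluster):

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # true relationship: y = 2x

w = 0.0      # model weight, learned below
lr = 0.02    # learning rate

for _ in range(2000):
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 3))  # converges close to 2.0
```

The per-example gradient terms are independent, which is why this maps naturally onto a distributed framework: each partition sums its own terms and only the small partial sums are combined.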
| 140. |
Point out the wrong statement. (a) For distributed storage, Spark can interface with a wide variety of systems, including Hadoop Distributed File System (HDFS) (b) Spark also supports a pseudo-distributed mode, usually used only for development or testing purposes (c) Spark has over 465 contributors in 2014 (d) All of the mentioned |
|
Answer» The correct answer is (d) All of the mentioned |
|
| 141. |
______________ leverages Spark Core's fast scheduling capability to perform streaming analytics. (a) MLlib (b) Spark Streaming (c) GraphX (d) RDDs |
|
Answer» The correct option is (b) Spark Streaming |
|
| 142. |
Spark SQL provides a domain-specific language to manipulate ___________ in Scala, Java, or Python. (a) Spark Streaming (b) Spark SQL (c) RDDs (d) All of the mentioned |
|
Answer» The correct option is (c) RDDs |
|
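To illustrate the idea behind question 142: a domain-specific language for manipulating collections chains operations like where and select. The sketch below is plain Python with illustrative names, not Spark SQL's actual API:

```python
class Dataset:
    """A tiny, chainable query DSL over a list of row dictionaries."""

    def __init__(self, rows):
        self.rows = rows

    def where(self, pred):
        """Keep only rows matching the predicate."""
        return Dataset([r for r in self.rows if pred(r)])

    def select(self, *cols):
        """Project each row down to the named columns."""
        return Dataset([{c: r[c] for c in cols} for r in self.rows])

    def collect(self):
        """Materialize the result, like Spark's collect()."""
        return self.rows

people = Dataset([
    {"name": "ada", "age": 36},
    {"name": "bob", "age": 17},
])

adults = people.where(lambda r: r["age"] >= 18).select("name").collect()
print(adults)  # [{'name': 'ada'}]
```

Each method returns a new Dataset, which is what makes the fluent chaining work; a real engine would additionally defer execution and optimize the whole chain before running it.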
| 143. |
____________ is a component on top of Spark Core. (a) Spark Streaming (b) Spark SQL (c) RDDs (d) All of the mentioned |
|
Answer» The correct option is (b) Spark SQL |
|
| 144. |
Spark was initially started by ____________ at UC Berkeley AMPLab in 2009. (a) Mahek Zaharia (b) Matei Zaharia (c) Doug Cutting (d) Stonebraker |
|
Answer» The correct choice is (b) Matei Zaharia. Explanation: Apache Spark is an open-source cluster computing framework originally developed in the AMPLab at UC Berkeley. |
|
| 145. |
Point out the correct statement. (a) RSS abstraction provides distributed task dispatching, scheduling, and basic I/O functionalities (b) For cluster manager, Spark supports standalone and Hadoop YARN (c) Hive SQL is a component on top of Spark Core (d) None of the mentioned |
|
Answer» The correct choice is (b) For cluster manager, Spark supports standalone and Hadoop YARN |
|