Explore topic-wise Interview Solutions.

This section includes Interview Solutions, each offering curated multiple-choice questions to sharpen your knowledge and support exam preparation. Choose a topic below to get started.

51.

Mahout provides an implementation of a ______________ identification algorithm which scores collocations using log-likelihood ratio.
(a) collocation
(b) compaction
(c) collection
(d) none of the mentioned

Answer: (a) collocation

Explanation: The log-likelihood score indicates the relative usefulness of a collocation with regard to other term combinations in the text.
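The scoring above can be sketched in plain Java. This is an illustrative reimplementation of the standard log-likelihood ratio (G²) for a 2x2 contingency table of co-occurrence counts, not Mahout's own code; the class and method names are invented for the example.

```java
public class Llr {
    // x * log(x), with the conventional 0 * log(0) = 0
    static double xLogX(long x) { return x == 0 ? 0.0 : x * Math.log(x); }

    // Unnormalized Shannon entropy of a set of counts
    static double entropy(long... counts) {
        long sum = 0; double terms = 0;
        for (long c : counts) { terms += xLogX(c); sum += c; }
        return xLogX(sum) - terms;
    }

    // k11: A and B together, k12: A without B, k21: B without A, k22: neither
    public static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        return 2.0 * (rowEntropy + colEntropy - matrixEntropy);
    }

    public static void main(String[] args) {
        // a strong collocation (terms co-occur far more than chance) scores high
        System.out.println(logLikelihoodRatio(100, 10, 10, 10000));
        // statistically independent counts score near zero
        System.out.println(logLikelihoodRatio(10, 90, 90, 810));
    }
}
```

A high score means the pair co-occurs more often than independence would predict, which is exactly how it ranks candidate collocations against other term combinations.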

52.

_________ does not restrict contributions to Hadoop-based implementations.
(a) Mahout
(b) Oozie
(c) Impala
(d) All of the mentioned

Answer: (a) Mahout

Explanation: Mahout is distributed under a commercially friendly Apache Software license.

53.

Point out the correct statement.
(a) Mahout is distributed under a commercially friendly Apache Software license
(b) Mahout is a library of scalable machine-learning algorithms, implemented on top of Apache Hadoop® and using the MapReduce paradigm
(c) Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms
(d) None of the mentioned

Answer: (d) None of the mentioned

Explanation: The goal of Mahout is to build a vibrant, responsive, diverse community to facilitate discussions not only on the project itself but also on potential use cases.

54.

The output descriptor for the table to be written is created by calling ____________
(a) OutputJobInfo.describe
(b) OutputJobInfo.create
(c) OutputJobInfo.put
(d) None of the mentioned

Answer: (b) OutputJobInfo.create

Explanation: The implementation of Map takes HCatRecord as an input and the implementation of Reduce produces it as an output.

55.

Mahout provides ____________ libraries for common and primitive Java collections.
(a) Java
(b) Javascript
(c) Perl
(d) Python

Answer: (a) Java

Explanation: Math operations are focused on linear algebra and statistics.

57.

___________ is the type supported for storing values in HCatalog tables.
(a) HCatRecord
(b) HCatColumns
(c) HCatValues
(d) All of the mentioned

Answer: (a) HCatRecord

Explanation: The types in an HCatalog table schema determine the types of objects returned for different fields in HCatRecord.

58.

The first call on the HCatOutputFormat must be ____________
(a) setOutputSchema
(b) setOutput
(c) setOut
(d) OutputSchema

Answer: (b) setOutput

Explanation: Any other call will throw an exception saying the output format is not initialized.

59.

_______________ method is used to include a projection schema, to specify the output fields.
(a) OutputSchema
(b) setOut
(c) setOutputSchema
(d) none of the mentioned

Answer: (c) setOutputSchema

Explanation: If a schema is not specified, all the columns in the table will be returned.

60.

Hive does not have a data type corresponding to the ____________ type in Pig.
(a) decimal
(b) short
(c) biginteger
(d) datetime

Answer: (c) biginteger

Explanation: Hive 0.12.0 and earlier releases support writing Pig primitive data types with HCatStorer.

61.

Point out the wrong statement.
(a) The Hive metastore lets you create tables without specifying a database
(b) Restrictions apply to the types of columns HCatLoader can read from HCatalog-managed tables
(c) If the table is partitioned, you can indicate which partitions to scan by immediately following the load statement with a partition filter statement
(d) None of the mentioned

Answer: (d) None of the mentioned

Explanation: If you created tables using the metastore, then the database name is 'default' and is not required when specifying the table for HCatLoader.

62.

____________ is used with Pig scripts to write data to HCatalog-managed tables.
(a) HamaStorer
(b) HCatStam
(c) HCatStorer
(d) All of the mentioned

Answer: (c) HCatStorer

Explanation: HCatStorer is accessed via a Pig store statement.

63.

Point out the correct statement.
(a) The HCatLoader and HCatStorer interfaces are used with Pig scripts to read and write data in HCatalog-managed tables
(b) HCatalog is not thread safe
(c) HCatLoader is used with Pig scripts to read data from HCatalog-managed tables
(d) All of the mentioned

Answer: (d) All of the mentioned

Explanation: HCatLoader is accessed via a Pig load statement.

64.

_________________ property allows users to override the specified expiry time.
(a) hcat.desired.partition.num.splits
(b) hcatalog.hive.client.cache.expiry.time
(c) hcatalog.hive.client.cache.disabled
(d) hcat.append.limit

Answer: (b) hcatalog.hive.client.cache.expiry.time

Explanation: This property is an int, and specifies the number of seconds.

65.

On the write side, the user is expected to pass in valid _________ with correctly formed data.
(a) HRecords
(b) HCatRecos
(c) HCatRecords
(d) None of the mentioned

Answer: (c) HCatRecords

Explanation: In some cases where a user of HCatalog (such as some older versions of Pig) does not support all the datatypes supported by Hive, there are a few config parameters provided to handle data promotions/conversions to allow them to read data through HCatalog.

66.

HCatalog maintains a cache of _________ to talk to the metastore.
(a) HiveServer
(b) HiveClients
(c) HCatClients
(d) All of the mentioned

Answer: (b) HiveClients

Explanation: HCatalog maintains a cache of one metastore client per thread, defaulting to an expiry of 120 seconds.
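The per-thread, time-expired reuse described above can be illustrated with a small generic cache. This is a sketch of the pattern only: the class and method names are invented, this is not HCatalog's actual client-cache code, and the expiry is passed in directly rather than read from hcatalog.hive.client.cache.expiry.time.

```java
import java.util.function.Supplier;

// Sketch: one cached value per thread, rebuilt after an expiry window.
// ExpiringThreadCache is an illustrative name, not an HCatalog class.
public class ExpiringThreadCache<T> {
    private static final class Entry<V> {
        final V value; final long createdAt;
        Entry(V value, long createdAt) { this.value = value; this.createdAt = createdAt; }
    }

    private final long expiryMillis;
    private final Supplier<T> factory;
    private final ThreadLocal<Entry<T>> slot = new ThreadLocal<>();

    public ExpiringThreadCache(long expirySeconds, Supplier<T> factory) {
        this.expiryMillis = expirySeconds * 1000L;
        this.factory = factory;
    }

    public T get() {
        Entry<T> e = slot.get();
        long now = System.currentTimeMillis();
        if (e == null || now - e.createdAt > expiryMillis) {
            e = new Entry<>(factory.get(), now);   // build a fresh client for this thread
            slot.set(e);
        }
        return e.value;
    }
}
```

Within the expiry window a thread keeps reusing its own client; after the window it builds a new one, which is the behaviour the 120-second default describes.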

67.

___________ property allows us to specify a custom dir location pattern for all the writes, and will interpolate each variable.
(a) hcat.dynamic.partitioning.custom.pattern
(b) hcat.append.limit
(c) hcat.pig.storer.external.location
(d) hcatalog.hive.client.cache.expiry.time

Answer: (a) hcat.dynamic.partitioning.custom.pattern

Explanation: By contrast, the distractor hcat.append.limit allows an HCatalog user to specify a custom append limit.

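A custom pattern such as ${year}/${month}/${day} implies per-record variable interpolation: each partition-key variable in the pattern is replaced with that record's partition value. The sketch below shows one hypothetical way to expand such a pattern; it is illustrative and not HCatalog's implementation.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Expands ${var} placeholders in a directory pattern from a map of
// partition values. PartitionPattern is an illustrative name.
public class PartitionPattern {
    private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}");

    public static String interpolate(String pattern, Map<String, String> values) {
        Matcher m = VAR.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String v = values.get(m.group(1));
            if (v == null) throw new IllegalArgumentException("no value for " + m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(v));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For a record with year=2024 and month=05, the pattern "${year}/${month}" expands to the directory "2024/05".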
68.

For ___________ partitioning jobs, simply specifying a custom directory is not good enough.
(a) static
(b) semi cluster
(c) dynamic
(d) all of the mentioned

Answer: (c) dynamic

Explanation: Since a dynamic partitioning job writes to multiple destinations, it requires a pattern specification instead of a directory specification.

69.

Point out the wrong statement.
(a) The original name of WebHCat was Templeton
(b) Robert in client management uses Hive to analyze his clients' results
(c) With HCatalog, HCatalog cannot send a JMS message that data is available
(d) All of the mentioned

Answer: (c) With HCatalog, HCatalog cannot send a JMS message that data is available

Explanation: HCatalog can send a JMS message that data is available; the Pig job can then be restarted after the client analysis.

70.

With HCatalog, _________ does not need to modify the table structure.
(a) Partition
(b) Columns
(c) Robert
(d) All of the mentioned

Answer: (c) Robert

Explanation: Without HCatalog, Robert must alter the table to add the required partition.

71.

Sally in data processing uses __________ to cleanse and prepare the data.
(a) Pig
(b) Hive
(c) HCatalog
(d) Impala

Answer: (a) Pig

Explanation: Without HCatalog, Sally must be manually informed by Joe when data is available, or poll on HDFS.

72.

Point out the correct statement.
(a) There is no guaranteed read consistency when a partition is dropped
(b) Unpartitioned tables effectively have one default partition that must be created at table creation time
(c) Once a partition is created, records cannot be added to it, removed from it, or updated in it
(d) All of the mentioned

Answer: (d) All of the mentioned

Explanation: Partitioned tables have no partitions at create time.

73.

__________ is a REST API for HCatalog.
(a) WebHCat
(b) WbHCat
(c) InpHCat
(d) None of the mentioned

Answer: (a) WebHCat

Explanation: REST stands for "representational state transfer", a style of API based on HTTP verbs.

74.

You can write to a single partition by specifying the partition key(s) and value(s) in the ___________ method.
(a) setOutput
(b) setOut
(c) put
(d) get

Answer: (a) setOutput

Explanation: You can write to multiple partitions if the partition key(s) are columns in the data being stored.

75.

The HCatalog __________ supports all Hive DDL that does not require MapReduce to execute.
(a) Powershell
(b) CLI
(c) CMD
(d) All of the mentioned

Answer: (b) CLI

Explanation: Data is defined using HCatalog's command line interface (CLI).

76.

_____________ accepts a table to read data from and optionally a selection predicate to indicate which partitions to scan.
(a) HCatOutputFormat
(b) HCatInputFormat
(c) OutputFormat
(d) InputFormat

Answer: (b) HCatInputFormat

Explanation: The HCatalog interface for MapReduce, HCatInputFormat and HCatOutputFormat, is an implementation of Hadoop InputFormat and OutputFormat.

77.

The HCatalog interface for Pig consists of ____________ and HCatStorer, which implement the Pig load and store interfaces respectively.
(a) HCLoader
(b) HCatLoader
(c) HCatLoad
(d) None of the mentioned

Answer: (b) HCatLoader

Explanation: HCatLoader accepts a table to read data from; you can indicate which partitions to scan by immediately following the load statement with a partition filter statement.

78.

Point out the wrong statement.
(a) HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools
(b) There is a Hive-specific interface for HCatalog
(c) Data is defined using HCatalog's command line interface (CLI)
(d) All of the mentioned

Answer: (b) There is a Hive-specific interface for HCatalog

Explanation: Since HCatalog uses Hive's metastore, Hive can read data in HCatalog directly; no Hive-specific interface is needed.

79.

HCatalog is built on top of the Hive metastore and incorporates Hive's ____________
(a) DDL
(b) DML
(c) TCL
(d) DCL

Answer: (a) DDL

Explanation: HCatalog provides read and write interfaces for Pig and MapReduce and uses Hive's command line interface for issuing data definition and metadata exploration commands.

80.

Hive version ___________ is the first release that includes HCatalog.
(a) 0.10.0
(b) 0.11.0
(c) 0.12.0
(d) All of the mentioned

Answer: (b) 0.11.0

Explanation: HCatalog graduated from the Apache incubator and merged with the Hive project on March 26, 2013.

81.

Point out the correct statement.
(a) HCat provides connectors for MapReduce
(b) Apache HCatalog provides table data access for CDH components such as Pig and MapReduce
(c) HCat makes Hive metadata available to users of other Hadoop tools like Pig, MapReduce and Hive
(d) All of the mentioned

Answer: (b) Apache HCatalog provides table data access for CDH components such as Pig and MapReduce

Explanation: Table definitions are maintained in the Hive metastore.

82.

HCatalog supports reading and writing files in any format for which a ________ can be written.
(a) SerDe
(b) SaerDear
(c) DocSear
(d) All of the mentioned

Answer: (a) SerDe

Explanation: By default, HCatalog supports the RCFile, CSV, JSON, SequenceFile, and ORC file formats. To use a custom format, you must provide the InputFormat, OutputFormat, and SerDe.

83.

A ________ is used to manage the efficient barrier synchronization of the BSPPeers.
(a) GroomServers
(b) BSPMaster
(c) Zookeeper
(d) None of the mentioned

Answer: (c) Zookeeper

Explanation: A groom server is a process that performs BSP tasks assigned by the BSPMaster.
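Barrier synchronization is the heart of BSP: no peer may enter superstep s+1 until every peer has finished superstep s. The sketch below uses a local Java CyclicBarrier in place of the ZooKeeper-based coordination Hama performs across machines; the class name, peer count, and superstep count are all illustrative.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal BSP-style superstep loop: compute locally, then wait at the
// barrier until every peer has finished the current superstep.
public class BspBarrierDemo {
    public static int run(int peers, int supersteps) {
        AtomicInteger work = new AtomicInteger();
        CyclicBarrier barrier = new CyclicBarrier(peers);
        Thread[] threads = new Thread[peers];
        for (int p = 0; p < peers; p++) {
            threads[p] = new Thread(() -> {
                try {
                    for (int s = 0; s < supersteps; s++) {
                        work.incrementAndGet();   // local computation phase
                        barrier.await();          // barrier synchronization phase
                    }
                } catch (Exception e) { throw new RuntimeException(e); }
            });
            threads[p].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) { throw new RuntimeException(e); }
        return work.get();   // peers * supersteps units of work completed
    }
}
```

Because the barrier is cyclic, the same barrier object is reused for every superstep, just as the peers re-synchronize at the end of each superstep in BSP.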

84.

________ is responsible for maintaining groom server status.
(a) GroomServers
(b) BSPMaster
(c) Zookeeper
(d) All of the mentioned

Answer: (b) BSPMaster

Explanation: A BSP master and multiple grooms are started by the start script.

85.

A __________ server and a data node should be run on one physical node.
(a) groom
(b) web
(c) client
(d) all of the mentioned

Answer: (a) groom

Explanation: Each groom is designed to run with HDFS or other distributed storage.

86.

Hama consists mainly of ________ components for large scale processing of graphs.
(a) two
(b) three
(c) four
(d) five

Answer: (b) three

Explanation: Hama consists of three major components: BSPMaster, GroomServers and Zookeeper.

87.

Point out the wrong statement.
(a) The major difference between Hadoop and Hama is that map/reduce tasks can't communicate with each other
(b) Hama follows the master/slave pattern
(c) A JobTracker maps to a BSPMaster, a TaskTracker maps to a GroomServer and a Map/Reduce task maps to a BSPTask
(d) All of the mentioned

Answer: (d) All of the mentioned

Explanation: BSP tasks can communicate with each other.

88.

Hama requires JRE _______ or higher and ssh to be set up between nodes in the cluster.
(a) 1.6
(b) 1.7
(c) 1.8
(d) 2.0

Answer: (a) 1.6

Explanation: Apache Hama releases are available under the Apache License, Version 2.0.

89.

Hama was inspired by Google's _________ large-scale graph computing framework.
(a) Pragmatic
(b) Pregel
(c) Preghad
(d) All of the mentioned

Answer: (b) Pregel

Explanation: Hama is a distributed computing model similar to MapReduce.

90.

Point out the correct statement.
(a) Apache Hama is a distributed computing framework based on Bulk Synchronous Parallel computing techniques for massive scientific computations
(b) Hama is a Top Level Project under the Apache Software Foundation
(c) BSP stands for Bulk Synchronous Parallel
(d) All of the mentioned

Answer: (d) All of the mentioned

Explanation: Hama is an open-source project available for many operating systems.

91.

Hama is a general ________________ computing engine on top of Hadoop.
(a) BSP
(b) ASP
(c) MPP
(d) None of the mentioned

Answer: (a) BSP

Explanation: Hama provides a high-performance computing engine for running massive scientific and iterative algorithms on an existing open source or enterprise Hadoop cluster.

92.

A __________ in a social graph is a group of people who interact frequently with each other and less frequently with others.
(a) semi-cluster
(b) partial cluster
(c) full cluster
(d) none of the mentioned

Answer: (a) semi-cluster

Explanation: Semi-clustering differs from ordinary clustering in that a vertex may belong to more than one semi-cluster.

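The defining property, overlapping membership, can be shown with a toy example: the same vertex appears in two semi-clusters at once. The clusters below are hand-picked for illustration, not produced by the semi-clustering algorithm, and SemiClusterDemo is an invented name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Maps each vertex to the list of semi-clusters it belongs to; with
// overlapping clusters, a vertex can appear in more than one.
public class SemiClusterDemo {
    public static Map<String, List<Set<String>>> membership(List<Set<String>> clusters) {
        Map<String, List<Set<String>>> byVertex = new HashMap<>();
        for (Set<String> cluster : clusters)
            for (String vertex : cluster)
                byVertex.computeIfAbsent(vertex, k -> new ArrayList<>()).add(cluster);
        return byVertex;
    }
}
```

With clusters {alice, bob} and {bob, carol}, bob is a member of two semi-clusters while alice and carol each belong to one, which an ordinary (partitioning) clustering would not allow.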
93.

Apache Hama provides a complete clone of _________
(a) Pragmatic
(b) Pregel
(c) ServePreg
(d) All of the mentioned

Answer: (b) Pregel

Explanation: Pregel is used for large-scale processing of graphs.

94.

The web UI provides information about ________ job statistics of the Hama cluster.
(a) MPP
(b) BSP
(c) USP
(d) ISP

Answer: (b) BSP

Explanation: Running, completed and failed jobs are detailed in the web UI.

95.

In Distributed Mode, the cluster machines are mapped in the __________ file.
(a) groomservers
(b) grervers
(c) grsvers
(d) groom

Answer: (a) groomservers

Explanation: Distributed Mode is used when you have multiple machines.

96.

Point out the wrong statement.
(a) Apache Hama is not a pure Bulk Synchronous Parallel engine
(b) Hama uses the Hadoop Core for RPC calls
(c) Apache Hama is optimized for massive scientific computations such as matrix, graph and network algorithms
(d) Hama is a relatively newer project than Hadoop

Answer: (a) Apache Hama is not a pure Bulk Synchronous Parallel engine

Explanation: Apache Hama is in fact a pure BSP engine, which is why statement (a) is wrong.

97.

_________ mode is used when you just have a single server and want to launch all the daemon processes.
(a) Local Mode
(b) Pseudo Distributed Mode
(c) Distributed Mode
(d) All of the mentioned

Answer: (b) Pseudo Distributed Mode

Explanation: Pseudo Distributed Mode can be configured by setting bsp.master.address to a host address.

98.

__________ is the default mode if you download Hama.
(a) Local Mode
(b) Pseudo Distributed Mode
(c) Distributed Mode
(d) All of the mentioned

Answer: (a) Local Mode

Explanation: This mode is configured by setting the bsp.master.address property to local.

99.

Point out the correct statement.
(a) In local mode, nothing must be launched via the start scripts
(b) Distributed Mode is just like the "Pseudo Distributed Mode"
(c) Apache Hama is one of the under-hyped projects in the Hadoop ecosystem
(d) All of the mentioned

Answer: (b) Distributed Mode is just like the "Pseudo Distributed Mode"

Explanation: You can adjust the number of threads used in local mode by setting the bsp.local.tasks.maximum property.

100.

____________ Collection API allows for even distribution of custom replica properties.
(a) BALANUNIQUE
(b) BALANCESHARDUNIQUE
(c) BALANCEUNIQUE
(d) None of the mentioned

Answer: (b) BALANCESHARDUNIQUE

Explanation: Solr powers the search and navigation features of many of the world's largest internet sites.