Explore topic-wise interview solutions below. This section offers curated multiple-choice questions on IBM InfoSphere, Amazon EC2/S3, Amazon Elastic MapReduce, Hadoop on Microsoft Azure, Cloudera, and Hadoop utilities to sharpen your knowledge and support exam preparation.

1. DataStage originated at __________, a company that developed two notable products: the UniVerse database and the DataStage ETL tool.
(a) VMark
(b) Vzen
(c) Hatez
(d) None of the mentioned

Answer» (a) VMark

Explanation: The first VMark ETL prototype was built by Lee Scheffler in the first half of 1996.

2. InfoSphere ___________ provides you with the ability to flexibly meet your unique information integration requirements.
(a) Data Server
(b) Information Server
(c) Info Server
(d) All of the mentioned

Answer» (b) Information Server

Explanation: IBM InfoSphere Information Server is a market-leading data integration platform that includes a family of products enabling you to understand, cleanse, monitor, transform, and deliver data.

3. ___________ is used for processing complex transactions and messages.
(a) PX
(b) Server Edition
(c) MVS Edition
(d) TX

Answer» (d) TX

Explanation: DataStage TX is used for processing complex transactions and messages; MVS Edition jobs, by contrast, are developed on a Windows or Unix/Linux platform and then transferred to the mainframe as compiled mainframe jobs.

4. __________ is the name given to the version of DataStage that had a parallel processing architecture and parallel ETL jobs.
(a) Enterprise Edition
(b) Server Edition
(c) MVS Edition
(d) TX

Answer» (a) Enterprise Edition

Explanation: DataStage 5 added Sequence Jobs, and DataStage 6 added Parallel Jobs via Enterprise Edition.

5. Point out the wrong statement.
(a) InfoSphere DataStage also facilitates extended metadata management and enterprise connectivity
(b) The Real-Time Integration pack can turn server or parallel jobs into SOA services
(c) In 2012 the suite was renamed to InfoSphere Information Server and the product was renamed to InfoSphere DataStage
(d) None of the mentioned

Answer» (c) In 2012 the suite was renamed to InfoSphere Information Server and the product was renamed to InfoSphere DataStage

Explanation: In 2006 the product was released as part of the IBM Information Server under the Information Management family, but it was still known as WebSphere DataStage; the renaming to InfoSphere came in 2008, not 2012.

6. InfoSphere DataStage uses a client/server design where jobs are created and administered via a ________ client against a central repository on a server.
(a) Ubuntu
(b) Windows
(c) Debian
(d) Solaris

Answer» (b) Windows

Explanation: IBM InfoSphere DataStage is capable of integrating data on demand across multiple, high-volume data sources and target applications using a high-performance parallel framework.

7. InfoSphere DataStage has __________ levels of parallelism.
(a) 1
(b) 2
(c) 3
(d) 4

Answer» (c) 3

Explanation: InfoSphere DataStage also facilitates extended metadata management and enterprise connectivity.

8. Point out the correct statement.
(a) IBM InfoSphere DataStage is an ETL tool
(b) IBM InfoSphere DataStage is part of the IBM Information Platforms Solutions suite and IBM InfoSphere
(c) InfoSphere uses a graphical notation to construct data integration solutions
(d) All of the mentioned

Answer» (d) All of the mentioned

Explanation: InfoSphere DataStage is a powerful data integration tool.

9. EC2 capacity can be increased or decreased in real time from as few as one to more than ___________ virtual machines simultaneously.
(a) 1000
(b) 2000
(c) 3000
(d) None of the mentioned

Answer» (a) 1000

Explanation: Billing takes place according to the computing and network resources consumed.

10. EC2 can serve as a practically unlimited set of ___________ machines.
(a) virtual
(b) real
(c) distributed
(d) all of the mentioned

Answer» (a) virtual

Explanation: To use EC2, a subscriber creates an Amazon Machine Image (AMI) containing the operating system, application programs, and configuration settings.
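
As an illustration of that workflow, the sketch below launches a single instance from an AMI using boto3, the AWS SDK for Python; the AMI ID, region, and instance type are hypothetical placeholders rather than values from the source.

    # Minimal sketch: launch one EC2 instance from an AMI using boto3.
    # The AMI ID, region, and instance type below are hypothetical placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # hypothetical AMI with OS, apps, and config
        InstanceType="t2.micro",          # assumed instance type
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])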

11. The IBM _____________ Platform provides all the foundational building blocks of trusted information, including data integration, data warehousing, master data management, big data and information governance.
(a) InfoStream
(b) InfoSphere
(c) InfoSurface
(d) InfoData

Answer» (b) InfoSphere

Explanation: The InfoSphere Platform provides an enterprise-class foundation for information-intensive projects, providing the performance, scalability, reliability and acceleration needed to simplify difficult challenges and deliver trusted information to your business faster.

12. Amazon ___________ is well suited to transfer bulk amounts of data.
(a) EC2
(b) EC3
(c) EC4
(d) All of the mentioned

Answer» (a) EC2

Explanation: Amazon EC2 enables you to scale up or down to handle changes in requirements or spikes in popularity, reducing your need to forecast traffic.

13. Amazon EC2 provides virtual computing environments, known as __________
(a) chunks
(b) instances
(c) messages
(d) none of the mentioned

Answer» (b) instances

Explanation: Using Amazon EC2 eliminates your need to invest in hardware up front.

14. Amazon ___________ provides developers the tools to build failure-resilient applications and isolate themselves from common failure scenarios.
(a) EC2
(b) EC3
(c) EC4
(d) All of the mentioned

Answer» (a) EC2

Explanation: Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use.

15. Point out the wrong statement.
(a) Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud
(b) Amazon EC2 is designed to make web-scale cloud computing easier for developers
(c) Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction
(d) None of the mentioned

Answer» (d) None of the mentioned

Explanation: Amazon EC2 reduces the time required to obtain and boot new server instances to minutes.

16. Amazon ___________ is a Web service that provides real-time monitoring to Amazon’s EC2 customers.
(a) AmWatch
(b) CloudWatch
(c) IamWatch
(d) All of the mentioned

Answer» (b) CloudWatch

Explanation: The current AMIs for all CoreOS channels and EC2 regions are updated frequently.
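
To make the CloudWatch answer concrete, here is a minimal boto3 sketch that reads an EC2 instance's CPUUtilization metric; the instance ID, region, and time window are hypothetical, and the metric names come from the standard AWS/EC2 namespace rather than from the source.

    # Minimal sketch: fetch average CPU utilization for one EC2 instance
    # from Amazon CloudWatch. Instance ID and time range are placeholders.
    import datetime
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

    end = datetime.datetime.utcnow()
    start = end - datetime.timedelta(hours=1)

    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical
        StartTime=start,
        EndTime=end,
        Period=300,             # 5-minute datapoints
        Statistics=["Average"],
    )
    for point in stats["Datapoints"]:
        print(point["Timestamp"], point["Average"])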

17. Point out the correct statement.
(a) Amazon Web Services offers reliable, scalable, and inexpensive cloud computing services
(b) MongoDB runs well on Amazon EC2
(c) To deploy MongoDB on EC2 you can either set up a new instance manually
(d) All of the mentioned

Answer» (d) All of the mentioned

Explanation: MongoDB on EC2 can be deployed easily by using sharded cluster management.

18. The Amazon ____________ is a Web-based service that allows business subscribers to run application programs in the Amazon.com computing environment.
(a) EC3
(b) EC4
(c) EMR
(d) None of the mentioned

Answer» (d) None of the mentioned

Explanation: Use Amazon EC2 for scalable computing capacity in the AWS cloud so you can develop and deploy applications without hardware constraints.

19. Impala executes SQL queries using a _________ engine.
(a) MAP
(b) MPP
(c) MPA
(d) None of the mentioned

Answer» (b) MPP

Explanation: Impala avoids Hive’s overhead from creating MapReduce jobs, giving it faster query times than Hive.
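
For illustration, the sketch below issues a query to an Impala daemon using the impyla Python client; the host, port, and table name are assumptions rather than details from the source, and impyla is only one possible client.

    # Minimal sketch: run a SQL query against Impala's MPP engine via impyla.
    # Host, port, and table name are hypothetical placeholders.
    from impala.dbapi import connect

    conn = connect(host="impala-coordinator.example.com", port=21050)  # assumed endpoint
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM web_logs")  # hypothetical table
    print(cur.fetchall())
    cur.close()
    conn.close()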

20. Impala on Amazon EMR requires _________ running Hadoop 2.x or greater.
(a) AMS
(b) AMI
(c) AWR
(d) All of the mentioned

Answer» (b) AMI

Explanation: Impala is an open source tool in the Hadoop ecosystem for interactive, ad hoc querying using SQL syntax.

21. ___________ is an RPC framework that defines a compact binary serialization format used to persist data structures for later analysis.
(a) Pig
(b) Hive
(c) Thrift
(d) None of the mentioned

Answer» (c) Thrift

Explanation: Apache Thrift is an RPC framework that defines a compact binary serialization format used to persist data structures for later analysis; Pig and Hive are data-flow and SQL-like query layers, not serialization frameworks.

22. Amazon EMR uses Hadoop processing combined with several __________ products.
(a) AWS
(b) ASQ
(c) AMR
(d) AWES

Answer» (a) AWS

Explanation: Amazon Elastic MapReduce (Amazon EMR) is a web service that makes it easy to process large amounts of data efficiently.

23. Hadoop clusters running on Amazon EMR use ______ instances as virtual Linux servers for the master and slave nodes.
(a) EC2
(b) EC3
(c) EC4
(d) None of the mentioned

Answer» (a) EC2

Explanation: Amazon EMR has made enhancements to Hadoop and other open-source applications to work seamlessly with AWS.
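
As a rough illustration of how EMR provisions those EC2 instances, the boto3 sketch below launches a small cluster; the release label, instance types, roles, and log bucket are placeholder assumptions rather than values given in the source.

    # Minimal sketch: start a small EMR cluster whose master and core (slave)
    # nodes run on EC2 instances. All names and sizes below are hypothetical.
    import boto3

    emr = boto3.client("emr", region_name="us-east-1")  # assumed region

    response = emr.run_job_flow(
        Name="example-cluster",                  # hypothetical cluster name
        ReleaseLabel="emr-6.10.0",               # assumed EMR release
        Instances={
            "MasterInstanceType": "m5.xlarge",   # EC2 instance type for the master node
            "SlaveInstanceType": "m5.xlarge",    # EC2 instance type for the core nodes
            "InstanceCount": 3,                  # 1 master + 2 core nodes
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        LogUri="s3://example-bucket/emr-logs/",  # hypothetical S3 log location
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    print(response["JobFlowId"])
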
24. Point out the wrong statement.
(a) Apache Hive saves Hive log files to /tmp/{user.name}/ in a file named hive.log
(b) Amazon EMR saves Hive logs to /mnt/var/log/apps/
(c) In order to support concurrent versions of Hive, the version of Hive you run determines the log file name
(d) None of the mentioned

Answer» (d) None of the mentioned

Explanation: If you have many GZip files in your Hive cluster, you can optimize performance by passing multiple files to each mapper.

25. The Amazon EMR default input format for Hive is __________
(a) org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
(b) org.apache.hadoop.hive.ql.iont.CombineHiveInputFormat
(c) org.apache.hadoop.hive.ql.io.CombineFormat
(d) All of the mentioned

Answer» (a) org.apache.hadoop.hive.ql.io.CombineHiveInputFormat

Explanation: You can specify the hive.base.inputformat option in Hive to select a different file format.
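
A minimal sketch of overriding that default from Python, assuming a cluster where the hive CLI is installed and the hive.base.inputformat option mentioned above is honoured; the table and the alternative input format class are illustrative placeholders.

    # Minimal sketch: run a HiveQL statement with a non-default input format
    # by setting hive.base.inputformat first. The table name and query are
    # hypothetical; assumes the "hive" CLI is available on the PATH.
    import subprocess

    hive_script = (
        "SET hive.base.inputformat=org.apache.hadoop.hive.ql.io.HiveInputFormat;\n"
        "SELECT COUNT(*) FROM web_logs;"
    )

    # "hive -e" executes the quoted HiveQL string in a new Hive session.
    subprocess.run(["hive", "-e", hive_script], check=True)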

26. Point out the correct statement.
(a) Amazon Elastic MapReduce (Amazon EMR) provides support for Apache Hive
(b) Pig extends the SQL paradigm by including serialization formats and the ability to invoke mapper and reducer scripts
(c) The Amazon Hive default input format is text
(d) All of the mentioned

Answer» (a) Amazon Elastic MapReduce (Amazon EMR) provides support for Apache Hive

Explanation: With Hive 0.13.1 on Amazon EMR, certain options introduced in previous versions of Hive on EMR have been removed in favor of greater parity with Apache Hive. For example, the -x option was removed.

27. Amazon EMR also allows you to run multiple versions concurrently, allowing you to control your ___________ version upgrade.
(a) Pig
(b) Windows Server
(c) Hive
(d) Ubuntu

Answer» (c) Hive

Explanation: Amazon EMR supports several versions of Hive, which you can install on any running cluster.

28. The Microsoft .NET Library for Avro provides data serialization for the Microsoft ___________ environment.
(a) .NET
(b) Hadoop
(c) Ubuntu
(d) None of the mentioned

Answer» (a) .NET

Explanation: The Microsoft .NET Library for Avro implements the Apache Avro compact binary data interchange format for serialization in the Microsoft .NET environment.

29. Which of the following individual components are included on HDInsight clusters?
(a) Hive
(b) Pig
(c) Oozie
(d) All of the mentioned

Answer» (d) All of the mentioned

Explanation: HDInsight provides several configurations for specific workloads, or you can customize clusters using Script Actions.

30. Microsoft Azure HDInsight comes with __________ types of interactive console.
(a) two
(b) three
(c) four
(d) five

Answer» (a) two

Explanation: One is the standard Hadoop Hive console; the other is unique in the Hadoop world and is based on JavaScript.

31. The key _________ command – which is traditionally a bash script – is also re-implemented as hadoop.cmd.
(a) start
(b) hadoop
(c) had
(d) hadstrat

Answer» (b) hadoop

Explanation: HDInsight is the framework for the Microsoft Azure cloud implementation of Hadoop.

32. Point out the wrong statement.
(a) The other flavor of the HDInsight interactive console is based on JavaScript
(b) Microsoft and Hortonworks have re-implemented the key binaries as executables
(c) The distribution consists of Hadoop 1.1.0, Pig-0.9.3, Hive 0.9.0, Mahout 0.5 and Sqoop 1.4.2
(d) All of the mentioned

Answer» (d) All of the mentioned

Explanation: JavaScript commands are converted to Pig statements.

33. In Hadoop, the _____________ is used to go to the Hadoop distribution directory for HDInsight.
(a) Shell
(b) Command Line
(c) Compaction
(d) None of the mentioned

Answer» (b) Command Line

Explanation: In order to run the Hadoop command line from a Windows cmd prompt, you need to log in to the HDInsight head node using Remote Desktop.

34. Hadoop ___________ is a utility to support running external map and reduce jobs.
(a) Orchestration
(b) Streaming
(c) Collection
(d) All of the mentioned

Answer» (b) Streaming

Explanation: These external jobs can be written in various programming languages such as Python or Ruby.
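
To make this concrete, below is a minimal Hadoop Streaming word-count sketch in Python; the script name and the invocation shown afterwards are assumptions (the exact path of the streaming jar varies by distribution), not details from the source.

    # streaming_wordcount.py -- a single script usable as both the external
    # mapper ("map" mode) and reducer ("reduce" mode) for Hadoop Streaming,
    # which pipes data to such programs over stdin/stdout.
    import sys

    def mapper():
        # Emit "word<TAB>1" for every word read from stdin.
        for line in sys.stdin:
            for word in line.split():
                print(f"{word}\t1")

    def reducer():
        # Streaming sorts mapper output by key, so counts for a word arrive
        # contiguously and can be summed with a single running total.
        current, total = None, 0
        for line in sys.stdin:
            word, count = line.rstrip("\n").split("\t", 1)
            if word != current and current is not None:
                print(f"{current}\t{total}")
                total = 0
            current = word
            total += int(count)
        if current is not None:
            print(f"{current}\t{total}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()

A typical (hypothetical) invocation passes the script as both mapper and reducer, for example: hadoop jar hadoop-streaming.jar -input in/ -output out/ -mapper "python streaming_wordcount.py map" -reducer "python streaming_wordcount.py reduce" -file streaming_wordcount.py.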

35. Microsoft and Hortonworks joined forces to make Hadoop available on ___________ for on-premise deployments.
(a) Windows 7
(b) Windows Server
(c) Windows 8
(d) Ubuntu

Answer» (b) Windows Server

Explanation: Win32 is supported as a development platform.

36. Point out the correct statement.
(a) Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes
(b) GNU/Linux is supported as a development and production platform
(c) Distributed operation has not been well tested on Win32, so it is not supported as a production platform
(d) All of the mentioned

Answer» (d) All of the mentioned

Explanation: Microsoft and Hortonworks joined forces to make Hadoop available on Windows Azure to support big data in the cloud.

37. To configure short-circuit local reads, you will need to enable ____________ on local Hadoop.
(a) librayhadoop
(b) libhadoop
(c) libhad
(d) none of the mentioned

Answer» (b) libhadoop

Explanation: Short-circuit reads make use of a UNIX domain socket.

38. _______ is an open source set of libraries, tools, examples, and documentation.
(a) Kite
(b) Kize
(c) Ookie
(d) All of the mentioned

Answer» (a) Kite

Explanation: Kite is used to simplify the most common tasks when building applications on top of Hadoop.

39. __________ is an online NoSQL database developed by Cloudera.
(a) HCatalog
(b) HBase
(c) Imphala
(d) Oozie

Answer» (b) HBase

Explanation: HBase is a distributed key-value store.
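
As a small illustration of HBase as a key-value store, the sketch below writes and reads one row using the happybase Python client over HBase's Thrift gateway; the host, table, column family, and row key are hypothetical, and happybase is just one client option.

    # Minimal sketch: put and get a single row in HBase via happybase.
    # Host, table name, column family, and row key are placeholders.
    import happybase

    connection = happybase.Connection("hbase-thrift.example.com")  # assumed Thrift server
    table = connection.table("web_logs")                           # hypothetical table

    # Store a value under row key "row1", column family "cf", qualifier "url".
    table.put(b"row1", {b"cf:url": b"http://example.com"})

    # Read the row back as a dict of {column: value}.
    print(table.row(b"row1"))
    connection.close()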

40. Point out the wrong statement.
(a) CDH contains the main, core elements of Hadoop
(b) In October 2012, Cloudera announced the Cloudera Impala project
(c) CDH may be downloaded from Cloudera’s website at no charge
(d) None of the mentioned

Answer» (d) None of the mentioned

Explanation: CDH may be downloaded from Cloudera’s website with no technical support and no Cloudera Manager.

41. Cloudera Enterprise comes in ___________ editions.
(a) One
(b) Two
(c) Three
(d) Four

Answer» (c) Three

Explanation: Cloudera Enterprise comes in three editions: Basic, Flex, and Data Hub.

42. Cloudera ___________ includes CDH and an annual subscription license (per node) to Cloudera Manager and technical support.
(a) Enterprise
(b) Express
(c) Standard
(d) All of the mentioned

Answer» (a) Enterprise

Explanation: CDH includes the core elements of Apache Hadoop plus several additional key open source projects.

43. Cloudera Express includes CDH and a version of Cloudera ___________ lacking enterprise features such as rolling upgrades and backup/disaster recovery.
(a) Enterprise
(b) Express
(c) Standard
(d) Manager

Answer» (d) Manager

Explanation: All versions may be downloaded from Cloudera’s website.

44. Point out the correct statement.
(a) Cloudera is also a sponsor of the Apache Software Foundation
(b) CDH is 100% Apache-licensed open source and is the only Hadoop solution to offer unified batch processing, interactive SQL, interactive search, and role-based access controls
(c) More enterprises have downloaded CDH than all other such distributions combined
(d) All of the mentioned

Answer» (d) All of the mentioned

Explanation: Cloudera says that more than 50% of its engineering output is donated upstream to the various Apache-licensed open source projects.

45. ___________ is the world’s most complete, tested, and popular distribution of Apache Hadoop and related projects.
(a) MDH
(b) CDH
(c) ADH
(d) BDH

Answer» (b) CDH

Explanation: Cloudera’s open-source Apache Hadoop distribution, CDH (Cloudera Distribution Including Apache Hadoop), targets enterprise-class deployments of that technology.

46. __________ is log collection and correlation software with reporting and alarming functionalities.
(a) Lucene
(b) ALOIS
(c) Imphal
(d) None of the mentioned

Answer» (b) ALOIS

Explanation: This project's activity was transferred to another incubator project, ODE.

47. Apache _________ is a project that enables development and consumption of REST-style web services.
(a) Wives
(b) Wink
(c) Wig
(d) All of the mentioned

Answer» (b) Wink

Explanation: The core server runtime is based on the JAX-RS (JSR 311) standard.

48. Which of the following is a standards-compliant XML Query processor?
(a) Whirr
(b) VXQuery
(c) Knife
(d) Lens

Answer» (b) VXQuery

Explanation: Apache VXQuery is a standards-compliant XML Query (XQuery) processor; Apache Whirr, by contrast, provides code for running a variety of software services on cloud infrastructure.

49. ___________ is a distributed data warehouse system for Hadoop.
(a) Stratos
(b) Tajo
(c) Sqoop
(d) Lucene

Answer» (b) Tajo

Explanation: Apache Tajo is a relational, distributed data warehouse system for Hadoop; Sqoop, in contrast, is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

50. ___________ is a distributed, fault-tolerant, and high-performance realtime computation system.
(a) Knife
(b) Storm
(c) Sqoop
(d) Lucene

Answer» (b) Storm

Explanation: Storm provides strong guarantees on the processing of data.