Explain the key features of Apache Spark.

Answer»

Apache Spark has the following key features:

Polyglot
Performance
Data sources
Lazy Evaluation
Real-time computation
Hadoop Integration
Machine Learning

Polyglot:

Spark applications can be written in Java, Scala, Python, or R. Spark also provides interactive shells for Scala (spark-shell) and Python (pyspark).

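Performance:

Apache Spark can be up to 100 times faster than Hadoop MapReduce. The largest gains come on workloads that fit in memory, because intermediate results stay in memory instead of being written to disk between steps.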

Data sources:

Spark supports multiple data sources, including file formats such as Parquet, CSV, and JSON, and storage systems such as Hive, Cassandra, and HBase.

Lazy Evaluation:

Spark delays execution until a result is actually needed. Transformations (such as map or filter) are only recorded in a DAG (directed acyclic graph). Spark executes that DAG when an action (such as collect or count) is called.
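
The transformation/action split can be illustrated with a toy model in plain Python. This is only a sketch and not Spark's API: the class name LazyDataset and the helper square are hypothetical, and the recorded plan here is a simple linear list rather than a real DAG. It shows that chaining transformations runs nothing, and that the recorded work executes only when the action collect() is called.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Step = Tuple[str, Callable]


class LazyDataset:
    """Toy stand-in for a Spark dataset (hypothetical, not the Spark API)."""

    def __init__(self, source: Iterable, plan: Optional[List[Step]] = None):
        self._source = source
        self._plan = plan or []  # recorded transformations, not yet executed

    def map(self, fn: Callable) -> "LazyDataset":
        # Transformation: only extends the plan; no element is touched.
        return LazyDataset(self._source, self._plan + [("map", fn)])

    def filter(self, pred: Callable) -> "LazyDataset":
        # Transformation: only extends the plan; no element is touched.
        return LazyDataset(self._source, self._plan + [("filter", pred)])

    def collect(self) -> list:
        # Action: the recorded plan is executed now, element by element.
        out = []
        for item in self._source:
            keep = True
            for kind, fn in self._plan:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


calls = []  # records every element that square() actually processes


def square(x: int) -> int:
    calls.append(x)
    return x * x


pipeline = LazyDataset(range(1, 6)).map(square).filter(lambda x: x % 2 == 1)
calls_before_action = list(calls)  # snapshot taken before any action runs
result = pipeline.collect()        # the action triggers execution

print(calls_before_action)  # []
print(result)               # [1, 9, 25]
```

The empty list printed first shows that building the pipeline did no work. In Spark itself, the same deferral lets the engine see the whole chain of transformations before running it.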

Real-time computation:

Spark can process data in near real time with low latency, because it computes in memory and makes full use of the cluster.

Hadoop Integration:

Spark is compatible with Hadoop. It can run on top of an existing Hadoop cluster using YARN, which makes it a potential replacement for Hadoop's MapReduce functions.

Machine Learning:

Along with its many other built-in libraries, Spark includes MLlib, its machine learning library. This gives data engineers and data scientists a powerful, unified engine that is fast and easy to use.
