InterviewSolution
1. Explain the key features of Apache Spark.
Answer» Apache Spark has the following key features:
Polyglot: Spark code can be written in Java, Scala, Python or R. It also provides interactive shells in Scala and Python.
Performance: Apache Spark can be up to 100 times faster than MapReduce, mainly for workloads that fit in memory; the actual speedup depends on the job.
Data Formats: Spark supports multiple data sources such as Parquet, CSV, JSON, Hive, Cassandra and HBase.
Lazy Evaluation: Spark delays execution until it is necessary. Transformations are only added to a DAG (directed acyclic graph) of computation, and the DAG is executed when an action is performed.
Real-time Computation: Spark computation has low latency because it works in memory and makes maximum use of the cluster.
Hadoop Integration: Spark is compatible with Hadoop. It is a potential replacement for the MapReduce functions of Hadoop, as it can run on top of an existing Hadoop cluster using YARN.
Machine Learning: Spark has many built-in libraries, including the MLlib library, so it gives data engineers and data scientists a powerful unified engine that is fast and easy to use.
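The lazy evaluation feature can be illustrated with a small toy model in plain Python. This is only a sketch of the idea, not Spark's implementation: the class `LazyDataset` and the helper `double` are hypothetical names made up for this example, and the recorded plan here is a simple linear list rather than a full DAG. The method names `map`, `filter` and `collect` were chosen to mirror the transformations and action of the same names on Spark RDDs.

```python
class LazyDataset:
    """Toy stand-in for a Spark RDD (hypothetical, for illustration only)."""

    def __init__(self, source, plan=()):
        self._source = source  # the input data
        self._plan = plan      # recorded transformations, in order

    def map(self, fn):
        # Transformation: only record the step, do no work yet.
        return LazyDataset(self._source, self._plan + (("map", fn),))

    def filter(self, pred):
        # Transformation: only record the step, do no work yet.
        return LazyDataset(self._source, self._plan + (("filter", pred),))

    def collect(self):
        # Action: run every recorded step and return the results.
        rows = iter(self._source)
        for kind, fn in self._plan:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)


calls = []  # records each time double() actually runs


def double(x):
    calls.append(x)
    return x * 2


# Building the pipeline records two transformations but executes nothing.
pipeline = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)

# The action triggers execution of the whole recorded plan.
result = pipeline.collect()
```

In this sketch `double` is not called while the pipeline is being built; it only runs once `collect()` is invoked, which is the same behaviour the answer describes for Spark transformations and actions.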