1.

How is Apache Spark different from MapReduce?

Answer:
| MapReduce | Apache Spark |
| --- | --- |
| Performs only batch processing of data. | Can process data both in real time (streaming) and in batches. |
| Processes large datasets slowly. | Runs up to roughly 100 times faster than MapReduce for in-memory big-data processing. |
| Writes intermediate data to HDFS (Hadoop Distributed File System), so retrieving data takes a long time. | Keeps data in memory (RAM), so data can be retrieved quickly when needed. |
| Depends heavily on disk I/O, which makes it a high-latency framework. | Supports in-memory data storage and caching, which makes it a low-latency computation framework. |
| Requires an external scheduler for jobs. | Has its own job scheduler, enabled by its in-memory data computation. |
