1.

What Are The Various Levels Of Persistence In Apache Spark?

Answer»

Apache Spark automatically persists the intermediate data from shuffle operations; however, users are advised to call the persist() method on an RDD if they plan to reuse it. Spark offers several persistence levels that store RDDs in memory, on disk, or in a combination of both, with different replication levels.
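As a rough sketch of the reuse pattern described above, the PySpark snippet below persists an RDD before running two actions on it. The helper name `count_and_sum`, the sample computation, and the choice of `MEMORY_AND_DISK` are illustrative assumptions, not part of the original answer. It needs a working `pyspark` installation, which is why the import sits inside the function.

```python
def count_and_sum(numbers):
    """Run two actions on one persisted RDD (hypothetical helper for illustration)."""
    # Imported lazily so that defining the function does not itself require pyspark.
    from pyspark import SparkContext, StorageLevel

    sc = SparkContext.getOrCreate()
    squares = sc.parallelize(numbers).map(lambda x: x * x)

    # Mark the RDD for reuse; nothing is stored until the first action runs.
    squares.persist(StorageLevel.MEMORY_AND_DISK)
    try:
        total = squares.sum()    # first action computes and stores the partitions
        count = squares.count()  # second action can read the stored partitions
    finally:
        squares.unpersist()      # release the storage once the RDD is no longer needed
    return count, total
```

Without the `persist` call, the second action would recompute the `map` step from the source data.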

The various storage/persistence levels in Spark are:

MEMORY_ONLY
MEMORY_ONLY_SER
MEMORY_AND_DISK
MEMORY_AND_DISK_SER
DISK_ONLY
OFF_HEAP
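One way to read the level names is as combinations of a few flags. The sketch below models them in plain Python, without Spark itself. The flag values follow the constructor arguments of Spark's Scala `StorageLevel` class as defined in Spark 2.x and later, to the best of my knowledge. `Level`, `LEVELS`, `describe`, and `replicated` are hypothetical names introduced only for this illustration.

```python
from collections import namedtuple

# Field order mirrors Spark's StorageLevel arguments (assumed from the Scala API).
Level = namedtuple("Level", "use_disk use_memory use_off_heap deserialized replication")

LEVELS = {
    "MEMORY_ONLY":         Level(False, True,  False, True,  1),
    "MEMORY_ONLY_SER":     Level(False, True,  False, False, 1),
    "MEMORY_AND_DISK":     Level(True,  True,  False, True,  1),
    "MEMORY_AND_DISK_SER": Level(True,  True,  False, False, 1),
    "DISK_ONLY":           Level(True,  False, False, False, 1),
    "OFF_HEAP":            Level(True,  True,  True,  False, 1),
}

def describe(name):
    """Summarize where a level keeps data and in what form."""
    lvl = LEVELS[name]
    places = [p for p, on in (("memory", lvl.use_memory), ("disk", lvl.use_disk)) if on]
    form = "deserialized" if lvl.deserialized else "serialized"
    return f"{' + '.join(places)}, {form}, replication {lvl.replication}"

def replicated(level, copies=2):
    """Model the *_2 variants, which keep each partition on two nodes."""
    return level._replace(replication=copies)

if __name__ == "__main__":
    for name in LEVELS:
        print(f"{name}: {describe(name)}")
```

Read this way, the `_SER` suffix only flips the deserialized flag, and variants such as `MEMORY_ONLY_2` differ from their base level only in the replication count.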



