1.

What is the key Spark-Driver component to handle the execution of Big Data?

Answer:
  • DAGScheduler:

DAGScheduler is the scheduling layer of Apache Spark that implements stage-oriented scheduling. It transforms a logical execution plan into a physical plan of stages: SparkContext hands the logical plan to the DAGScheduler, which translates it into a set of stages that are submitted as TaskSets for execution. Stage boundaries are drawn at shuffle dependencies, since a shuffle requires all output of the previous stage before the next can start.
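The stage-splitting idea can be sketched with a toy model (plain Python, not Spark's actual classes): walk an RDD lineage and cut a new stage at every shuffle boundary.

```python
# Toy model (hypothetical helper, not Spark's API): split a linear RDD
# lineage into stages at shuffle boundaries, the way DAGScheduler does.
def split_into_stages(lineage):
    """lineage: list of (op_name, is_shuffle) tuples, parent first."""
    stages, current = [], []
    for op, is_shuffle in lineage:
        if is_shuffle and current:
            stages.append(current)   # close the stage before the shuffle
            current = []
        current.append(op)
    if current:
        stages.append(current)
    return stages

lineage = [("textFile", False), ("map", False),
           ("reduceByKey", True), ("map", False)]
print(split_into_stages(lineage))
# [['textFile', 'map'], ['reduceByKey', 'map']]
```

Here `reduceByKey` forces a shuffle, so the two `map`-style operations end up in different stages.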

  • TaskScheduler:

TaskScheduler is responsible for submitting tasks for execution in a Spark application. It tracks the executors in the application through the executorHeartbeatReceived and executorLost methods, which report active and lost executors, respectively. Spark ships with the following TaskSchedulers: TaskSchedulerImpl, the default TaskScheduler (which the two YARN-specific TaskSchedulers extend); YarnScheduler, for Spark on YARN in client deploy mode; and YarnClusterScheduler, for Spark on YARN in cluster deploy mode.
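The heartbeat-based liveness tracking can be illustrated with a small sketch (hypothetical class and method names, not Spark's API): record the last heartbeat per executor and treat any executor silent past a timeout as lost.

```python
import time

# Toy sketch (hypothetical names, not Spark's API): track live executors
# via heartbeats and expire ones that go silent, mirroring how the
# TaskScheduler learns about active vs. lost executors.
class ExecutorTracker:
    def __init__(self, timeout_s=10.0):
        self.timeout_s = timeout_s
        self.last_seen = {}          # executor_id -> last heartbeat time

    def executor_heartbeat_received(self, executor_id, now=None):
        self.last_seen[executor_id] = time.monotonic() if now is None else now

    def expire_dead_executors(self, now=None):
        now = time.monotonic() if now is None else now
        lost = [e for e, t in self.last_seen.items() if now - t > self.timeout_s]
        for e in lost:
            del self.last_seen[e]    # analogous to an executorLost callback
        return lost

tracker = ExecutorTracker(timeout_s=5.0)
tracker.executor_heartbeat_received("exec-1", now=0.0)
tracker.executor_heartbeat_received("exec-2", now=4.0)
print(tracker.expire_dead_executors(now=6.0))   # ['exec-1']
```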

  • SchedulerBackend:

SchedulerBackend is a pluggable interface for supporting different cluster managers. Cluster managers differ in their task-scheduling modes and resource-offer mechanisms; Spark abstracts those differences behind the SchedulerBackend contract.

  • BlockManager:

BlockManager is a key-value store for blocks of data that runs on every node, including the driver. It stores and serves blocks such as cached RDD partitions, shuffle output, and broadcast variables, serving them locally when possible and fetching them from remote BlockManagers otherwise.
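The local-then-remote lookup behaviour can be sketched with a toy block store (hypothetical class, not Spark's API):

```python
# Toy sketch (hypothetical class, not Spark's API): a BlockManager acts as
# a per-node key-value store of blocks; misses fall back to peer nodes.
class ToyBlockManager:
    def __init__(self, peers=None):
        self.store = {}              # block_id -> bytes
        self.peers = peers or []     # other ToyBlockManagers to fetch from

    def put(self, block_id, data):
        self.store[block_id] = data

    def get(self, block_id):
        if block_id in self.store:   # local hit
            return self.store[block_id]
        for peer in self.peers:      # remote fetch from another node
            if block_id in peer.store:
                return peer.store[block_id]
        return None                  # block not found anywhere

remote = ToyBlockManager()
remote.put("rdd_0_1", b"partition-1")
local = ToyBlockManager(peers=[remote])
local.put("rdd_0_0", b"partition-0")
print(local.get("rdd_0_1"))  # b'partition-1'
```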


