InterviewSolution
1. What is the key Spark Driver component that handles the execution of Big Data?
Answer»
DAGScheduler is the scheduling layer of Apache Spark that implements stage-oriented scheduling. SparkContext hands a logical execution plan to DAGScheduler, which translates it into a physical plan: a set of stages that are submitted as TaskSets for execution.

TaskScheduler is responsible for submitting tasks for execution in a Spark application. It tracks the executors in the application through the executorHeartbeatReceived and executorLost methods, which report active and lost executors, respectively. Spark ships with the following TaskSchedulers:

- TaskSchedulerImpl, the default TaskScheduler (which the two YARN-specific TaskSchedulers below extend).
- YarnScheduler, for Spark on YARN in client deploy mode.
- YarnClusterScheduler, for Spark on YARN in cluster deploy mode.
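To make "stage-oriented scheduling" concrete, here is a toy model of how a logical plan is cut into stages at shuffle (wide-dependency) boundaries. This is an illustrative sketch only; the `Node`, `needs_shuffle`, and `split_into_stages` names are made up for this example and are not Spark's actual API.

```python
# Toy model: a logical plan is a DAG of operators; a stage is a run of
# operators that can be pipelined, and a new stage begins wherever an
# operator must read its parent's output through a shuffle.

class Node:
    def __init__(self, name, parents=(), needs_shuffle=False):
        self.name = name                  # operator name, e.g. "map"
        self.parents = parents            # upstream nodes in the plan
        self.needs_shuffle = needs_shuffle  # reads parents via a shuffle?

def split_into_stages(final_node):
    """Walk the plan from the final node; cut a new stage whenever a
    shuffle dependency is crossed (parent stages are emitted first)."""
    stages = []

    def build(node):
        stage = []
        todo = [node]
        while todo:
            n = todo.pop()
            stage.append(n.name)
            for p in n.parents:
                if n.needs_shuffle:
                    build(p)      # shuffle boundary: parent forms its own stage
                else:
                    todo.append(p)  # narrow dependency: pipeline into this stage
        stages.append(stage)

    build(final_node)
    return stages

# Example plan: textFile -> map -> reduceByKey (shuffle) -> collect
text = Node("textFile")
mapped = Node("map", parents=(text,))
reduced = Node("reduceByKey", parents=(mapped,), needs_shuffle=True)
final = Node("collect", parents=(reduced,))

# Two stages: the map side before the shuffle, then the reduce side.
print(split_into_stages(final))
# → [['map', 'textFile'], ['collect', 'reduceByKey']]
```

In real Spark, the same cut happens at ShuffleDependency boundaries: `textFile` and `map` run in one stage, and `reduceByKey` starts the next.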
SchedulerBackend is a pluggable interface that supports different cluster managers. Cluster managers differ in their task-scheduling modes and resource-offer mechanisms, and Spark abstracts those differences behind the SchedulerBackend contract.
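The pluggable-backend idea can be sketched as follows: the task scheduler matches pending tasks against resource offers, while each backend decides how offers are produced for its cluster manager. The class and method names here (`SchedulerBackend`, `resource_offers`, `schedule`) are illustrative stand-ins, not Spark's real interfaces.

```python
# Sketch of a pluggable scheduler backend: the scheduling logic is shared,
# and only the source of resource offers varies per cluster manager.
from abc import ABC, abstractmethod

class SchedulerBackend(ABC):
    @abstractmethod
    def resource_offers(self):
        """Return a list of (executor_id, free_cores) offers."""

class LocalBackend(SchedulerBackend):
    """Single-process backend: all cores live on the driver."""
    def __init__(self, cores):
        self.cores = cores
    def resource_offers(self):
        return [("driver", self.cores)]

class StandaloneBackend(SchedulerBackend):
    """Cluster backend: offers come from remote executors."""
    def __init__(self, executors):
        self.executors = executors   # {executor_id: free_cores}
    def resource_offers(self):
        return list(self.executors.items())

def schedule(tasks, backend):
    """Assign each task one core from the backend's offers, FIFO."""
    assignments = []
    offers = backend.resource_offers()
    for task in tasks:
        for i, (exec_id, cores) in enumerate(offers):
            if cores > 0:
                assignments.append((task, exec_id))
                offers[i] = (exec_id, cores - 1)
                break
    return assignments

# Same scheduling code, different backends:
print(schedule(["t0", "t1", "t2"],
               StandaloneBackend({"exec-1": 2, "exec-2": 1})))
# → [('t0', 'exec-1'), ('t1', 'exec-1'), ('t2', 'exec-2')]
```

The design point is that the scheduler never knows which cluster manager it is talking to; swapping YARN for standalone or local mode only swaps the backend implementation.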
Together, these components are responsible for translating Spark user code into the actual Spark jobs executed on the cluster.