1.

Why do we need the master driver in Spark?

Answer»

The driver is the central point and the entry point of the Spark shell, which supports the languages Scala, Python, and R. Below is the sequential process the driver follows to execute a Spark job.

  • The driver runs the main() function of the application, which creates the SparkContext (see the sketch after this list).
  • The driver program runs on the master node of the Spark cluster and schedules the job execution.
  • It translates the RDDs into an execution graph (DAG) and splits the graph into multiple stages.
  • The driver stores metadata about all the Resilient Distributed Datasets (RDDs) and their partitions.
  • The driver program converts the user application into smaller execution units known as tasks; a group of tasks that runs together forms a stage.
  • Tasks are then executed by the executors, i.e. the worker processes that run individual tasks.
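
The minimal sketch below illustrates these steps with a simple word-count driver. The object name, app name, and input path are illustrative assumptions, not taken from the original answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch of a driver program (names and paths are hypothetical).
object WordCountDriver {
  def main(args: Array[String]): Unit = {
    // main() runs on the driver and creates the SparkContext,
    // the entry point to the cluster.
    val conf = new SparkConf().setAppName("word-count-sketch").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // Transformations only build the RDD lineage (execution graph);
    // nothing runs on the executors yet.
    val counts = sc.textFile("input.txt")   // hypothetical input file
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)                   // shuffle here creates a stage boundary

    // The action triggers the driver to split the graph into stages
    // and schedule tasks on the executors.
    counts.take(10).foreach(println)

    sc.stop()
  }
}
```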

The complete process can be tracked through the cluster manager's user interface. The driver also exposes information about the running Spark application through a Web UI at port 4040.
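
The default UI port can be changed through the spark.ui.port configuration property; the sketch below assumes an alternative port chosen for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: override the default Web UI port (4040) via spark.ui.port.
val conf = new SparkConf()
  .setAppName("ui-port-example")
  .setMaster("local[*]")
  .set("spark.ui.port", "4050")   // hypothetical alternative port

val sc = new SparkContext(conf)
// While the application runs, the driver serves the UI at http://<driver-host>:4050
```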


