|
Answer» Deploy mode determines where the driver program runs when a Spark application is submitted. Spark has two deploy modes:
- Client mode – The driver runs on the machine from which the Spark job is submitted. This mode suits cases where the submitting machine is close to the cluster, so that network latency between the driver and the executors is low. The submitting machine must stay up and connected to the cluster for as long as the job is running. If it shuts down or is disconnected, the job fails and has to be resubmitted. This mode is typically chosen for small or interactive jobs.
- Cluster mode – The driver is launched inside the Spark cluster, on one of its nodes. Because the driver and the executors run in the same cluster, latency between them is minimal. It can also make the application more fault-tolerant: depending on the cluster manager and its configuration, the driver can be relaunched on another node if the driver node fails.
Note: The deploy mode is set when the Spark job is submitted, through the --deploy-mode option of spark-submit (the default is client).
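As a rough illustration of choosing the deploy mode at submission time, the sketch below assembles a spark-submit command line in Python. The helper name build_spark_submit, the script name my_job.py and the yarn master are assumptions made for this example; only the --master and --deploy-mode options come from spark-submit itself.

```python
import shlex

# The two deploy modes accepted by spark-submit.
VALID_DEPLOY_MODES = ("client", "cluster")


def build_spark_submit(app, master, deploy_mode="client", app_args=()):
    """Return the argv list for a spark-submit call (hypothetical helper).

    app         -- path to the application (script or jar); illustrative only
    master      -- cluster manager URL or name, e.g. "yarn"
    deploy_mode -- "client" (driver on the submitting machine) or
                   "cluster" (driver launched inside the cluster)
    """
    if deploy_mode not in VALID_DEPLOY_MODES:
        raise ValueError(f"deploy_mode must be one of {VALID_DEPLOY_MODES}")
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        app,
        *app_args,
    ]


# Example: run the driver inside the cluster rather than on this machine.
cmd = build_spark_submit("my_job.py", "yarn", deploy_mode="cluster")
print(shlex.join(cmd))
# prints: spark-submit --master yarn --deploy-mode cluster my_job.py
```

The resulting list could be passed to subprocess.run on a machine where spark-submit is installed. Switching deploy_mode to "client" keeps the driver on the submitting machine instead.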
|