1.

Why do we need broadcast variables in Spark?

Answer»

Broadcast variables let developers keep a read-only variable cached on each machine instead of shipping a copy of it with every task. They are typically used to give every node a copy of a large input dataset efficiently. Spark distributes broadcast variables to the nodes using efficient broadcast algorithms to reduce communication cost.
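To make the saving concrete, here is a rough, Spark-free sketch in plain Python that compares the bytes sent when a lookup table travels with every task against the bytes sent when each node receives it once. The table contents, task count, node count, and function names are all hypothetical, and the model ignores the distribution algorithm Spark actually uses. In Spark itself, the broadcast path corresponds to calling `sc.broadcast(table)` on the driver and reading `.value` inside the tasks.

```python
import pickle


def bytes_shipped_per_task(lookup, num_tasks):
    """No broadcast: the table is serialized once for every task."""
    return len(pickle.dumps(lookup)) * num_tasks


def bytes_shipped_broadcast(lookup, num_nodes):
    """Broadcast: each node receives and caches a single read-only copy."""
    return len(pickle.dumps(lookup)) * num_nodes


# Hypothetical workload: a 10,000-entry lookup table, 1,000 tasks, 10 nodes.
lookup = {i: f"country-{i}" for i in range(10_000)}
num_tasks, num_nodes = 1_000, 10

per_task = bytes_shipped_per_task(lookup, num_tasks)
broadcast = bytes_shipped_broadcast(lookup, num_nodes)

print(f"shipped with every task: {per_task:,} bytes")
print(f"broadcast once per node: {broadcast:,} bytes")
print(f"reduction factor: {per_task // broadcast}x")  # 1,000 tasks / 10 nodes = 100x
```

In this simplified model the saving is just the ratio of tasks to nodes, so the benefit grows as more tasks run on each machine and as the shared dataset gets larger.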


