InterviewSolution
| 1. |
Why do we need broadcast variables in Spark? |
|
Answer» Broadcast variables let developers keep a read-only variable cached on each machine instead of shipping a copy of it with every task. They are typically used to give every node a copy of a large input dataset efficiently. Spark distributes broadcast variables using efficient broadcast algorithms to reduce communication cost. |
|
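To make the saving concrete, here is a rough sketch in plain Python (not Spark itself) that compares the bytes shipped when a lookup table travels with every task against the bytes shipped when it is sent once per node and cached. The function names, the table, and the task and node counts are all hypothetical, and the model ignores Spark's real serialization and broadcast protocol. In PySpark the corresponding calls are `SparkContext.broadcast(value)` on the driver and reading `.value` on the returned object inside tasks.

```python
import pickle


def bytes_shipped_per_task(lookup, num_tasks):
    """Simplified model: the lookup table is serialized along with every task."""
    return num_tasks * len(pickle.dumps(lookup))


def bytes_shipped_broadcast(lookup, num_nodes):
    """Simplified model: the table is sent once per node and cached there."""
    return num_nodes * len(pickle.dumps(lookup))


# Hypothetical workload: a 10,000-entry lookup table, 200 tasks, 4 worker nodes.
lookup = {i: f"country-{i}" for i in range(10_000)}

per_task = bytes_shipped_per_task(lookup, num_tasks=200)
broadcast = bytes_shipped_broadcast(lookup, num_nodes=4)

# With 200 tasks on 4 nodes, per-task shipping moves 50 times as many bytes.
print(per_task // broadcast)  # 50
```

Under this model the saving grows with the number of tasks per node, which is why broadcasting pays off mainly for large read-only data that many tasks share.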