1.

Explain the distributed Cache in MapReduce framework.

Answer»

Distributed Cache is an important feature of the MapReduce framework, used when you want to share files across all nodes in a Hadoop cluster. These files can be jar files or simple properties files.

Hadoop's MapReduce framework can cache small to moderate-sized read-only files, such as text files, ZIP files, and jar files, and distribute them to all the DataNodes (worker nodes) on which MapReduce jobs are running. Each DataNode receives its own local copy of the file, sent by the Distributed Cache, so tasks read it from local disk instead of fetching it over the network each time.
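
The mechanism can be illustrated with a small, self-contained Java sketch. This is only a toy simulation of the idea, not Hadoop's implementation: the class name `DistributedCacheSketch`, the helpers `localize` and `loadLookup`, the `node-N` directories, and the KEY=VALUE sample file are all hypothetical names chosen for illustration. In a real job using the Hadoop 2.x+ API, a file is typically registered with `Job.addCacheFile(URI)` and retrieved inside a task through `context.getCacheFiles()`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DistributedCacheSketch {

    // Simulates localization: copy the read-only source file into one
    // directory per (simulated) worker node and return the local copies.
    static List<Path> localize(Path source, Path clusterRoot, int nodeCount) throws IOException {
        List<Path> localCopies = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            Path nodeDir = Files.createDirectories(clusterRoot.resolve("node-" + i));
            Path localCopy = nodeDir.resolve(source.getFileName());
            Files.copy(source, localCopy, StandardCopyOption.REPLACE_EXISTING);
            localCopies.add(localCopy);
        }
        return localCopies;
    }

    // Simulates what a task does during setup: load KEY=VALUE lines
    // from the node-local copy into an in-memory lookup table.
    static Map<String, String> loadLookup(Path localCopy) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        for (String line : Files.readAllLines(localCopy)) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                lookup.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return lookup;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for a small properties file.
        Path source = Files.createTempFile("countries", ".properties");
        Files.write(source, Arrays.asList("US=United States", "IN=India"));

        Path clusterRoot = Files.createTempDirectory("cluster");
        for (Path copy : localize(source, clusterRoot, 3)) {
            // Each simulated node reads only its own local copy.
            System.out.println(copy.getParent().getFileName() + " -> " + loadLookup(copy).get("IN"));
        }
    }
}
```

The point of the design is that the file is copied once per node rather than fetched once per record, which is why it suits small, read-only lookup data.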


