1.

What is DistCp?

Answer»

DistCp (distributed copy) is a tool for copying very large amounts of data to and from Hadoop file systems in parallel. It uses MapReduce to effect its distribution, error handling, recovery, and reporting. It expands a list of files and directories into input for map tasks, each of which copies a partition of the files specified in the source list.
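
The partitioning idea can be illustrated with a small Python sketch. This is not DistCp's actual code: the function name `partition_files`, the greedy largest-first strategy, and the sample paths and sizes are all assumptions made for illustration. It only shows how a source list might be divided so that each map task receives a roughly equal share of bytes.

```python
from typing import List, Tuple


def partition_files(files: List[Tuple[str, int]], num_maps: int) -> List[List[str]]:
    """Split (path, size_in_bytes) pairs into num_maps groups of similar total size.

    Hypothetical helper: a greedy heuristic that hands each file, largest
    first, to the partition that currently holds the fewest bytes.
    """
    if num_maps < 1:
        raise ValueError("num_maps must be at least 1")

    partitions: List[List[str]] = [[] for _ in range(num_maps)]
    loads = [0] * num_maps  # bytes assigned to each partition so far

    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        target = loads.index(min(loads))  # least-loaded partition
        partitions[target].append(path)
        loads[target] += size

    return partitions


# Illustrative source list; paths and sizes are made up.
source_list = [("/data/a", 400), ("/data/b", 300), ("/data/c", 200), ("/data/d", 100)]
for task_id, paths in enumerate(partition_files(source_list, num_maps=2)):
    print(f"map task {task_id} copies {paths}")
```

On a real cluster none of this is written by hand: the tool is invoked as `hadoop distcp <source> <destination>`, and the MapReduce job handles the splitting and copying.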


