| 1. |
What is DistCp? |
|
Answer» DistCp (distributed copy) is a tool for copying very large amounts of data to and from Hadoop file systems in parallel. It uses MapReduce to effect its distribution, error handling, recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which copies a partition of the files specified in the source list. |
|
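To make the partitioning idea concrete, here is a minimal Python sketch of how an expanded file list could be divided among map tasks. It is an illustration only, not DistCp's actual implementation: the function name `partition_files`, the sample paths, and the greedy largest-first balancing strategy are all assumptions chosen for clarity.

```python
from typing import Dict, List


def partition_files(files: Dict[str, int], num_maps: int) -> List[List[str]]:
    """Split files (path -> size in bytes) into num_maps partitions.

    Hypothetical helper: a greedy largest-first assignment that keeps the
    byte count per partition roughly even. Real DistCp uses its own
    input formats; this only illustrates the concept.
    """
    if num_maps < 1:
        raise ValueError("num_maps must be at least 1")

    partitions: List[List[str]] = [[] for _ in range(num_maps)]
    loads = [0] * num_maps  # bytes assigned to each partition so far

    # Visit files from largest to smallest (ties broken by path) and give
    # each one to the partition that currently holds the fewest bytes.
    for path, size in sorted(files.items(), key=lambda kv: (-kv[1], kv[0])):
        target = loads.index(min(loads))
        partitions[target].append(path)
        loads[target] += size

    return partitions


if __name__ == "__main__":
    # Made-up source list: path -> size in bytes.
    source_list = {"/data/a": 100, "/data/b": 60, "/data/c": 50, "/data/d": 10}
    for task_id, paths in enumerate(partition_files(source_list, 2)):
        print(f"map task {task_id}: {paths}")
```

In this sketch each inner list stands for the set of files one map task would copy. On a real cluster the tool is run from the command line in the form `hadoop distcp <source> <destination>`, and the partitioning happens inside the MapReduce job.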