InterviewSolution
1. How to process Big Data?
Answer» There are various frameworks for Big Data processing. One of the most popular is MapReduce. It consists of two main phases, the Map phase and the Reduce phase, with an intermediate Shuffle phase between them. The given job is divided into map tasks and reduce tasks:
The input is divided into splits of fixed size, and each split is given to one mapper. The mappers run in parallel, so the execution time is drastically reduced and the output is produced much faster. The input to a mapper is a key-value pair, and its output is another set of key-value pairs. This intermediate result is then shuffled (grouped by key) and passed to the reducers. The output of the reducers is the desired final result.
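
The Map, Shuffle, and Reduce phases described above can be sketched in plain Python with the classic word-count job. This is a minimal single-machine illustration, not a real framework: the function names (`mapper`, `reducer`, `run_job`) and the in-memory shuffle are illustrative assumptions; in a real MapReduce system the splits would run on separate machines in parallel.

```python
from collections import defaultdict

# Illustrative word-count sketch of the Map -> Shuffle -> Reduce flow.
# All names here are hypothetical, not part of any framework's API.

def mapper(offset, line):
    # Input key-value pair: (byte/line offset, line of text).
    # Output: intermediate (word, 1) pairs.
    for word in line.split():
        yield (word, 1)

def reducer(word, counts):
    # Aggregates all intermediate values for one key.
    yield (word, sum(counts))

def run_job(splits):
    # Map phase: in a real cluster, each split runs on its own mapper in parallel.
    intermediate = []
    for split in splits:
        for offset, line in enumerate(split):
            intermediate.extend(mapper(offset, line))
    # Shuffle phase: group the intermediate pairs by key.
    grouped = defaultdict(list)
    for key, value in intermediate:
        grouped[key].append(value)
    # Reduce phase: one reducer invocation per distinct key.
    result = {}
    for key, values in grouped.items():
        for k, v in reducer(key, values):
            result[k] = v
    return result

splits = [["big data is big"], ["data processing"]]
print(run_job(splits))  # {'big': 2, 'data': 2, 'is': 1, 'processing': 1}
```

Note how the reducer never sees raw mapper output directly: the shuffle step guarantees that all values for a given key arrive together, which is what lets reducers also run independently in parallel.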