Explain the WordCount Implementation via the Hadoop Framework

Answer»

We count the words across all the input files. The flow is as follows:

Input: Assume there are two files, each containing the sentence "Hello World Hello World" (one in file 1, one in file 2).

Mapper: There is one mapper per file here, since each small file forms a single input split. For the given sample input, the first map output is:

< Hello, 1>
< World, 1>
< Hello, 1>
< World, 1>

The second map output:

< Hello, 1>
< World, 1>
< Hello, 1>
< World, 1>
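
As a rough illustration of the map step, here is a minimal Python sketch that simulates it outside Hadoop. The function name `map_words` and the in-memory strings standing in for the two files are assumptions made for this example; a real Hadoop job would normally implement a Mapper class in Java instead.

```python
def map_words(text):
    """Emit a (word, 1) pair for every word, mimicking a WordCount mapper."""
    return [(word, 1) for word in text.split()]


# Hypothetical stand-ins for the contents of the two input files.
file_1 = "Hello World Hello World"
file_2 = "Hello World Hello World"

# One map call per file, matching the one-mapper-per-file setup above.
first_map_output = map_words(file_1)
second_map_output = map_words(file_2)

for pair in first_map_output:
    print(pair)
```

Each mapper works only on its own file, so the two outputs are produced independently of each other.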

Combiner/Sorting: This is done for each individual map, so the pairs emitted by one mapper are sorted and summed locally before they are sent to the reducer. The output of the first map then looks like this:

< Hello, 2>
< World, 2>

The output of the second map:

< Hello, 2>
< World, 2>
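
The combiner step can be sketched the same way. This is a plain Python simulation, not Hadoop code; the function name `combine` and the hard-coded map output are assumptions for illustration only.

```python
from collections import Counter


def combine(map_output):
    """Locally sum the counts emitted by a single mapper, sorted by key."""
    totals = Counter()
    for word, count in map_output:
        totals[word] += count
    return sorted(totals.items())


# Hypothetical output of the first mapper, copied from the example above.
first_map_output = [("Hello", 1), ("World", 1), ("Hello", 1), ("World", 1)]

combined_first = combine(first_map_output)
print(combined_first)  # [('Hello', 2), ('World', 2)]
```

Because the combiner only sees one mapper's pairs, it cannot produce the final totals; it just reduces the amount of data passed on to the reducer.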

Reducer: It sums the outputs of both maps and generates the output below:

< Hello, 4>
< World, 4>
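
A minimal sketch of the reduce step, again as a Python simulation rather than actual Hadoop code. The helper names `shuffle` and `reduce_counts` are hypothetical; in Hadoop, grouping values by key is handled by the framework between the map and reduce phases.

```python
from collections import defaultdict


def shuffle(*combined_outputs):
    """Group the partial counts by word across all map outputs."""
    grouped = defaultdict(list)
    for output in combined_outputs:
        for word, count in output:
            grouped[word].append(count)
    return grouped


def reduce_counts(word, counts):
    """Sum every partial count received for one word."""
    return word, sum(counts)


# Hypothetical combiner outputs of the two mappers from the example above.
combined_first = [("Hello", 2), ("World", 2)]
combined_second = [("Hello", 2), ("World", 2)]

grouped = shuffle(combined_first, combined_second)
final_counts = [reduce_counts(word, counts) for word, counts in sorted(grouped.items())]

for word, total in final_counts:
    print(f"{word} {total} times")
```

The reducer receives each word once, together with all of its partial counts, which is why a simple sum is enough to get the final total.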

Output

The final output would look like:

Hello 4 times
World 4 times
