1.

How To Configure The Split Value?

Answer»

By default block SIZE = 64MB, but to process the data, JOB tracker split the data. Hadoop architect use these formulas to know split size.

  1. split size = min (max_splitsize, max (block_size, min_split_size));
  2. split size = max(min_split_size, min (block_size, max_split, size));

by default split size = block size 
Always No of splits = No of mappers.

Apply above FORMULA:

  • split size = Min (max_splitsize, max (64, 512kB) // max _splitsize = depends on env, may 1gb or 10GB split size = min (10gb (let assume), 64) split size = 64MB.
  • split size = max(min_split_size, min (block_size, max_split, size)); split size = max (512kb, min (64, 10GB)); split size = max (512kb, 64);split size = 64 MB;

By default block size = 64mb, but to process the data, job tracker split the data. Hadoop architect use these formulas to know split size.

by default split size = block size 
Always No of splits = No of mappers.

Apply above formula:



Discussion

No Comment Found