MapReduce Tutorial : Multiple mappers, reducers and partitioning
To achieve parallelism, a job must run several mappers and several reducers concurrently.
Multiple mappers
The number of mappers is determined automatically from the sizes of the input files. Every input file is divided into splits; the default split size is 32MB. Each split is then processed by a separate mapper.
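As a rough illustration of this arithmetic, the sketch below estimates how many mappers a job would get. It assumes each file is split independently into chunks of at most the split size and that every chunk goes to exactly one mapper. The function name and the file sizes are hypothetical, and the real framework may round differently near split boundaries.

```python
MB = 1024 * 1024


def estimate_mappers(file_sizes, split_size=32 * MB):
    """Estimate the mapper count: one mapper per split, files split independently."""
    total = 0
    for size in file_sizes:
        # Integer ceiling division; an empty file contributes no split in this sketch.
        total += (size + split_size - 1) // split_size
    return total


# Hypothetical input: three files of 100 MB, 10 MB and 64 MB.
# They need 4, 1 and 2 splits respectively.
print(estimate_mappers([100 * MB, 10 * MB, 64 * MB]))  # prints 7
```

Note that in this model a small file still occupies a whole mapper, so many small inputs produce many mappers regardless of the total data volume.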
The split size can be overridden with the mapred.min.split.size and mapred.max.split.size flags. See the next tutorial step for how to set these flags.
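To see how a minimum and a maximum bound might interact, here is a minimal sketch that clamps a base split size between the two. It assumes the rule max(min, min(max, base)), which resembles how Hadoop's FileInputFormat computes split sizes in its newer API; the exact rule may differ between Hadoop versions. The function name and the values are illustrative only.

```python
MB = 1024 * 1024


def effective_split_size(base, min_size=1, max_size=None):
    """Clamp a base split size into [min_size, max_size] (assumed rule)."""
    # No maximum given means the base size is not capped from above.
    capped = base if max_size is None else min(max_size, base)
    # The minimum wins if it is larger than the capped value.
    return max(min_size, capped)


base = 32 * MB  # default split size stated in this tutorial
print(effective_split_size(base) // MB)                    # prints 32
print(effective_split_size(base, max_size=16 * MB) // MB)  # prints 16
print(effective_split_size(base, min_size=64 * MB) // MB)  # prints 64
```

Under this assumption, lowering the maximum yields more, smaller splits (and so more mappers), while raising the minimum yields fewer, larger ones.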