Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-8 [2012/01/25 14:54]
straka
courses:mapreduce-tutorial:step-8 [2012/01/25 15:00]
straka
Line 9:
  
 ===== Multiple reducers =====
 +The number of reducers is specified by the job; the default is one. Because the outputs of the reducers are not merged, there are as many output files as reducers.
  
 +To use multiple reducers, the MR job must be executed by a cluster (even one consisting of a single computer), not locally. The number of reducers is specified by the ''-r'' flag:
 +  perl script.pl [-j cluster_master | -c cluster_size [-w sec_to_wait]] [-r number_of_reducers]
 +
 +==== Partitioning ====
 +When there are multiple reducers, the way the (key, value) pairs are distributed among the reducers matters.
 +
 +By default, a (key, value) pair is sent to reducer number //hash(key) modulo number_of_reducers//. This guarantees that all values of a given key are processed by the same reducer.
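
The default rule can be illustrated with a small sketch that is independent of the framework. It is written in Python only for illustration; the function names ''default_partition'' and ''distribute'' are hypothetical, and ''zlib.crc32'' is merely a stand-in for whatever hash function the framework actually uses, which the text does not specify.

```python
import zlib
from collections import defaultdict


def default_partition(key: str, number_of_reducers: int) -> int:
    """Return the reducer index for a key: hash(key) modulo number_of_reducers."""
    # Assumption: crc32 is used here only because it is deterministic across
    # runs; the framework's real hash function may differ.
    return zlib.crc32(key.encode("utf-8")) % number_of_reducers


def distribute(pairs, number_of_reducers):
    """Group (key, value) pairs by the reducer index they would be sent to."""
    reducers = defaultdict(list)
    for key, value in pairs:
        reducers[default_partition(key, number_of_reducers)].append((key, value))
    return dict(reducers)


pairs = [("apple", 1), ("pear", 1), ("apple", 2), ("plum", 1), ("pear", 3)]
assignment = distribute(pairs, 3)

# Every pair with the same key lands in the same reducer's list.
for reducer_index, items in sorted(assignment.items()):
    print(reducer_index, items)
```

Because the reducer index depends only on the key, both ''apple'' pairs end up in one list, whichever reducer that happens to be.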
 +
 +To override the default behaviour, an MR job can specify a //partitioner//.
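
As a rough idea of what a partitioner might compute, the sketch below maps keys to reducers by their first letter, so that each reducer (and therefore each output file) receives a contiguous range of initial letters. It is a framework-independent illustration in Python; the name ''first_letter_partitioner'' and its signature are hypothetical and do not show how a partitioner is registered with the MR job.

```python
def first_letter_partitioner(key: str, number_of_reducers: int) -> int:
    """Hypothetical partitioner: split the letters a-z into contiguous ranges."""
    # Keys that do not start with an ASCII letter all go to reducer 0.
    if not key or not key[0].isascii() or not key[0].isalpha():
        return 0
    letter = ord(key[0].lower()) - ord("a")  # 0 for 'a' ... 25 for 'z'
    # Scale the letter index 0..25 down to a reducer index 0..number_of_reducers-1.
    return letter * number_of_reducers // 26


for word in ["apple", "mango", "nut", "Zebra", "42"]:
    print(word, first_letter_partitioner(word, 2))
```

Like the default rule, this function depends only on the key, so all values of one key still reach the same reducer; only the grouping of keys across reducers changes.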
  