[ Skip to the content ]

Institute of Formal and Applied Linguistics Wiki


[ Back to the navigation ]

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision Both sides next revision
courses:mapreduce-tutorial:step-8 [2012/01/29 21:04]
straka
courses:mapreduce-tutorial:step-8 [2012/01/29 21:12]
straka
Line 21: Line 21:
 By default, (key, value) pair is sent to a reducer number //hash(key) modulo number_of_reducers//. This guarantees that for one key, all its values are processed by a unique reducer. By default, (key, value) pair is sent to a reducer number //hash(key) modulo number_of_reducers//. This guarantees that for one key, all its values are processed by a unique reducer.
  
-To override the default behaviour, MR job can specify a //partitioner//. A partitioner is given every (key, value) pair produced by a mapper, it is also given the number of reducers, and outputs the zero-based number of reducer, where this (key, value) pair belongs:+To override the default behaviour, MR job can specify a //partitioner//. A partitioner is given every (key, value) pair produced by a mapper, it is also given the number of reducers, and outputs the zero-based number of reducer, where this (key, value) pair belongs
 + 
 +A partitioner should be provided if 
 +  * the default partitioner fails to distribute the data between reducers equally, i.e., some of the reducers operate on much more data than others. 
 +  * you need an explicit control of (key, value) placement. This can happen for example when [[.:step-13|sorting data]].
  
 <code perl> <code perl>

[ Back to the navigation ] [ Back to the content ]