Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-8 [2012/01/25 15:00]
straka
courses:mapreduce-tutorial:step-8 [2012/01/25 15:16]
straka
Line 19: Line 19:
 By default, a (key, value) pair is sent to reducer number //hash(key) modulo number_of_reducers//. This guarantees that all values for a given key are processed by the same reducer.
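
The default rule can be sketched in plain Perl. This is only an illustration: ''toy_hash'' and ''default_partition'' are hypothetical names, and the hash function is a stand-in, not the one the framework actually uses.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the framework's hash function: a simple
# polynomial string hash kept within 31 bits so it is never negative.
sub toy_hash {
  my ($key) = @_;
  my $h = 0;
  $h = ($h * 31 + ord($_)) & 0x7FFFFFFF for split //, $key;
  return $h;
}

# Default-style partitioning: hash(key) modulo number_of_reducers.
sub default_partition {
  my ($key, $reducers) = @_;
  return toy_hash($key) % $reducers;
}

# Equal keys always map to the same reducer index.
for my $key (qw(apple banana apple)) {
  printf "%s -> reducer %d\n", $key, default_partition($key, 4);
}
```

Because the reducer index depends only on the key and the number of reducers, both occurrences of ''apple'' are reported with the same reducer.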
  
-To override the default behaviour, MR job can specify a //partitioner//.
+To override the default behaviour, an MR job can specify a //partitioner//. A partitioner receives each (key, value) pair produced by a mapper together with the number of reducers, and returns the zero-based number of the reducer to which the pair belongs:
 + 
 +<code perl> 
 +package Partitioner; 
 +use Moose; 
 +with 'Hadoop::Partitioner'; 
 + 
 +sub getPartition { 
 +  my ($self, $key, $value, $partitions) = @_; 
 + 
 +  # Assumes integer keys; the result is a reducer index in 0 .. $partitions - 1.
 +  return $key % $partitions; 
 +}
 + 
 +... 
 +package Main; 
 +use Hadoop::Runner; 
 + 
 +my $runner = Hadoop::Runner->new( 
 +  ... 
 +  partitioner => Partitioner->new(), 
 +  ...); 
 +... 
 +</code> 
 + 
 +An MR job that specifies a partitioner must also have a reducer. The partitioner is not called when there is only one reducer.
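
To see how the modulo partitioner distributes keys, the same arithmetic can be tried outside the framework. The sketch below uses plain functions instead of the ''Hadoop::Partitioner'' role so that it runs without Moose or the Hadoop modules; ''get_partition'', ''assign_keys'' and the sample keys are hypothetical and serve only as an illustration.

```perl
use strict;
use warnings;

# Same arithmetic as getPartition, written as a plain function.
sub get_partition {
  my ($key, $partitions) = @_;
  return $key % $partitions;
}

# Group numeric keys by the reducer index they would be sent to.
sub assign_keys {
  my ($partitions, @keys) = @_;
  my @buckets = map { [] } 1 .. $partitions;
  push @{ $buckets[ get_partition($_, $partitions) ] }, $_ for @keys;
  return \@buckets;
}

# Hypothetical input: keys 0..7 distributed over 3 reducers.
my $buckets = assign_keys(3, 0 .. 7);
print "reducer $_: @{ $buckets->[$_] }\n" for 0 .. $#$buckets;
```

With three reducers and keys 0 to 7, reducer 0 gets keys 0, 3 and 6, reducer 1 gets 1, 4 and 7, and reducer 2 gets 2 and 5.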
  