Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-8 [2012/01/25 15:00] straka |
courses:mapreduce-tutorial:step-8 [2012/01/25 15:16] straka |
||
---|---|---|---|
Line 19: | Line 19: | ||
By default, a (key, value) pair is sent to reducer number //hash(key) modulo number_of_reducers// | By default, a (key, value) pair is sent to reducer number //hash(key) modulo number_of_reducers// | ||
- | To override the default behaviour, an MR job can specify a // | + | To override the default behaviour, an MR job can specify a // | ||
+ | |||
+ | <code perl> | ||
+ | package Partitioner; | ||
+ | use Moose; | ||
+ | with ' | ||
+ | |||
+ | sub getPartition { | ||
+ | my ($self, $key, $value, $partitions) = @_; | ||
+ | |||
+ | return $key % $partitions; | ||
+ | } | ||
+ | |||
+ | ... | ||
+ | package Main; | ||
+ | use Hadoop:: | ||
+ | |||
+ | my $runner = Hadoop:: | ||
+ | ... | ||
+ | partitioner => Partitioner-> | ||
+ | ...); | ||
+ | ... | ||
+ | </ | ||
+ | |||
+ | An MR job must have a reducer if it specifies a partitioner. Also, the partitioner is not called if there is only one reducer. | ||
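The routing rules described above can be tried out without a cluster. The following is a minimal sketch in plain Perl, not the framework's implementation: `toy_hash`, `default_partition`, `custom_partition` and `route` are hypothetical names introduced here for illustration, and the hash function is an arbitrary stand-in for whatever hash the framework really uses. It only shows how a rule of the `getPartition` kind decides which reducer receives each key, and that the rule is skipped when there is a single reducer.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the framework's hash function (an assumption,
# not the real one): sums the character codes of the key.
sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h += ord($_) for split //, $key;
    return $h;
}

# Default rule from the text: hash(key) modulo number_of_reducers.
sub default_partition {
    my ($key, $partitions) = @_;
    return toy_hash($key) % $partitions;
}

# Custom rule mirroring getPartition above: numeric key modulo reducers.
sub custom_partition {
    my ($key, $value, $partitions) = @_;
    return $key % $partitions;
}

# Route (key, value) pairs to reducers. With a single reducer the
# partitioner is never called and everything goes to reducer 0.
sub route {
    my ($pairs, $partitions, $partitioner) = @_;
    my @buckets = map { [] } 1 .. $partitions;
    for my $pair (@$pairs) {
        my ($key, $value) = @$pair;
        my $reducer = $partitions == 1
            ? 0
            : $partitioner->($key, $value, $partitions);
        push @{ $buckets[$reducer] }, $key;
    }
    return \@buckets;
}

my @pairs   = ([3, 'a'], [4, 'b'], [7, 'c'], [10, 'd']);
my $buckets = route(\@pairs, 3, \&custom_partition);
print "reducer $_: @{ $buckets->[$_] }\n" for 0 .. $#$buckets;
```

With three reducers, key 3 lands on reducer 0 and keys 4, 7 and 10 on reducer 1, so a poorly chosen rule can leave some reducers idle.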