Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-8 [2012/01/29 21:04] straka |
courses:mapreduce-tutorial:step-8 [2012/01/30 00:38] straka Improving package names of Perl programs. |
||
---|---|---|---|
Line 21: | Line 21: | ||
By default, a (key, value) pair is sent to reducer number //hash(key) modulo number_of_reducers// | By default, a (key, value) pair is sent to reducer number //hash(key) modulo number_of_reducers// | ||
- | To override the default behaviour, an MR job can specify a // | + | To override the default behaviour, an MR job can specify a // |
+ | |||
+ | A partitioner should be provided if | ||
+ | * the default partitioner fails to distribute the data between reducers equally, i.e., some of the reducers operate on much more data than others. | ||
+ | * you need explicit control over (key, value) placement. This can happen, for example, when [[.:step-13|sorting data]]. | ||
<code perl> | <code perl> | ||
- | package Partitioner; | + | package |
use Moose; | use Moose; | ||
with ' | with ' | ||
Line 35: | Line 39: | ||
... | ... | ||
- | package | + | package |
use Hadoop:: | use Hadoop:: | ||
my $runner = Hadoop:: | my $runner = Hadoop:: | ||
... | ... | ||
- | partitioner => Partitioner-> | + | partitioner => My::Partitioner-> |
...); | ...); | ||
... | ... | ||
Line 56: | Line 60: | ||
Run one MR job on '/ | Run one MR job on '/ | ||
wget --no-check-certificate ' | wget --no-check-certificate ' | ||
+ | # NOW EDIT THE FILE | ||
+ | # $EDITOR step-8-exercise.pl | ||
rm -rf step-8-out-ex; | rm -rf step-8-out-ex; | ||
less step-8-out-ex/ | less step-8-out-ex/ | ||
Line 62: | Line 68: | ||
You can also download the solution {{: | You can also download the solution {{: | ||
wget --no-check-certificate ' | wget --no-check-certificate ' | ||
+ | # NOW VIEW THE FILE | ||
+ | # $EDITOR step-8-solution.pl | ||
rm -rf step-8-out-sol; | rm -rf step-8-out-sol; | ||
less step-8-out-sol/ | less step-8-out-sol/ |
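The placement rule described above (reducer number = //hash(key) modulo number_of_reducers//) and an explicit alternative can be sketched in plain Perl. This is only an illustration and does not use the tutorial's Hadoop Perl API: ''simple_hash'', ''default_partition'' and ''first_letter_partition'' are hypothetical names, and the hash function is a stand-in, since the framework may hash keys differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the framework's hash function: a simple
# polynomial string hash. The real default partitioner may hash differently.
sub simple_hash {
    my ($key) = @_;
    my $h = 0;
    $h = ($h * 31 + ord($_)) % 1_000_003 for split //, $key;
    return $h;
}

# Default-style placement: reducer number = hash(key) modulo number_of_reducers.
sub default_partition {
    my ($key, $reducers) = @_;
    return simple_hash($key) % $reducers;
}

# Explicit placement (hypothetical rule for two reducers): keys starting
# with a..m go to reducer 0, all other keys go to reducer 1.
sub first_letter_partition {
    my ($key) = @_;
    return lc(substr($key, 0, 1)) le 'm' ? 0 : 1;
}

# Show where a few sample keys would land under both rules.
for my $key (qw(apple zebra mango)) {
    printf "%-6s default=%d explicit=%d\n",
        $key, default_partition($key, 4), first_letter_partition($key);
}
```

With the hash-based rule, the reducer a key lands on is effectively arbitrary, which spreads typical data evenly but gives no control over placement. A range-style rule like the second one keeps all keys of reducer 0 ordered before all keys of reducer 1, which is the kind of explicit control that sorting needs.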