
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-10 [2012/01/25 18:37]
straka
courses:mapreduce-tutorial:step-10 [2012/01/25 19:01]
straka
Line 1: Line 1:
 ====== MapReduce Tutorial : Combiners ======
  
 +Sometimes the reduce is a binary operation that is associative and commutative, e.g. ''+''. In that case it is inefficient to produce all the (key, value) pairs in the mappers and send every one of them through the network.
 +
 +Instead, the reducer can be executed right after the map, on //some portion// of the values belonging to the same key. Only the results are then sent through the network.
 +
 +A Hadoop job can have such a locally executed reducer, called a //combiner//. If a combiner is specified, the output of a mapper is processed by the combiner before the pairs are sent to the reducer. The combiner may be invoked zero, one or multiple times, usually when the data are written to disk.
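Because the combiner may run any number of times, the final result must not depend on how the values are grouped. The following is a minimal plain-Perl sketch of that property for summation. It does not use the Hadoop API; ''reduce_sum'' and the sample values are hypothetical stand-ins chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a summing reducer: folds a list of numbers into one.
sub reduce_sum {
  my $sum = 0;
  $sum += $_ for @_;
  return $sum;
}

# Values emitted by the mappers for a single key (made-up sample data).
my @values = (3, 1, 4, 1, 5, 9, 2, 6, 5, 3);

# Combiner invoked zero times: the reducer sees every raw value.
my $no_combiner = reduce_sum(@values);

# Combiner invoked once on each of three chunks; the reducer sums the partial results.
my @partial = (
  reduce_sum(@values[0 .. 3]),
  reduce_sum(@values[4 .. 7]),
  reduce_sum(@values[8 .. 9]),
);
my $one_pass = reduce_sum(@partial);

# Combiner invoked again on some of its own outputs before the reducer runs.
my @repartial = (reduce_sum(@partial[0 .. 1]), $partial[2]);
my $two_pass = reduce_sum(@repartial);

print "$no_combiner $one_pass $two_pass\n";   # prints "39 39 39"
```

All three totals agree because ''+'' is associative and commutative, so any grouping of the values gives the same sum.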
 +
 +Typically, the combiner is the same as the reducer of the MapReduce job.
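The word "typically" matters: a reducer can double as a combiner only if applying it to partial results leaves the final answer unchanged. As a hedged illustration in plain Perl (not the Hadoop API; ''mean'' and ''combine_pairs'' are hypothetical helpers), averaging fails this test, while carrying (sum, count) pairs restores it.

```perl
use strict;
use warnings;

# Arithmetic mean of a non-empty list.
sub mean {
  my $sum = 0;
  $sum += $_ for @_;
  return $sum / scalar(@_);
}

# Two portions of the values for one key, as two combiner calls might see them.
my @chunk_a = (1, 2, 3);
my @chunk_b = (10);

my $true_mean  = mean(@chunk_a, @chunk_b);              # 16 / 4
my $naive_mean = mean(mean(@chunk_a), mean(@chunk_b));  # (2 + 10) / 2

# Alternative: combine [sum, count] pairs, which is associative and commutative.
sub combine_pairs {
  my ($sum, $count) = (0, 0);
  for my $pair (@_) {
    $sum   += $pair->[0];
    $count += $pair->[1];
  }
  return [$sum, $count];
}

my $pair_a = combine_pairs(map { [$_, 1] } @chunk_a);   # [6, 3]
my $pair_b = combine_pairs(map { [$_, 1] } @chunk_b);   # [10, 1]
my $total  = combine_pairs($pair_a, $pair_b);           # [16, 4]
my $fixed_mean = $total->[0] / $total->[1];

print "$true_mean $naive_mean $fixed_mean\n";   # prints "4 6 4"
```

In this sketch the final division would be done only in the reducer, so the combiner and the reducer would differ slightly.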
 +
 +<code perl>
 +package Mapper;
 +use Moose;
 +with 'Hadoop::Mapper';
 +
 +sub map {
 +  my ($self, $key, $value, $context) = @_;
 +
 +  # Emit (word, 1) for every non-empty token of the input value.
 +  foreach my $word (split /\W/, $value) {
 +    next if not length $word;
 +    $context->write($word, 1);
 +  }
 +}
 +
 +package Reducer;
 +use Moose;
 +with 'Hadoop::Reducer';
 +
 +sub reduce {
 +  my ($self, $key, $values, $context) = @_;
 +
 +  my $sum = 0;
 +  while ($values->next) {
 +    $sum += $values->value;
 +  }
 +
 +  $context->write($key, $sum);
 +}
 +
 +package Main;
 +use Hadoop::Runner;
 +
 +# The same Reducer class is used as both the combiner and the reducer.
 +my $runner = Hadoop::Runner->new(
 +  mapper => Mapper->new(),
 +  combiner => Reducer->new(),
 +  reducer => Reducer->new(),
 +  input_format => 'KeyValueTextInputFormat');
 +
 +$runner->run();
 +</code>
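To get a feeling for the saving, the job above can be imitated locally in plain Perl without Hadoop. This is only a sketch under stated assumptions: the two input strings are made up, each string stands for the input of one mapper, and the combiner is assumed to see the whole output of its mapper at once (the best case; a real job may combine smaller portions or not at all).

```perl
use strict;
use warnings;

# Hypothetical input: each element is the text processed by one mapper.
my @splits = ('a b a', 'b b c');

my $raw_pairs      = 0;   # pairs sent over the network without a combiner
my $combined_pairs = 0;   # pairs sent when each mapper's output is combined
my %total;                # final word counts, as the reducers would produce

for my $text (@splits) {
  my %local;              # per-mapper partial sums, i.e. the combiner output
  for my $word (split /\W/, $text) {
    next if not length $word;
    $raw_pairs++;         # the mapper emits one (word, 1) pair per token
    $local{$word}++;
  }
  $combined_pairs += scalar keys %local;
  $total{$_} += $local{$_} for keys %local;
}

my $counts = join ' ', map { "$_=$total{$_}" } sort keys %total;
print "$raw_pairs pairs without combiner, $combined_pairs with; $counts\n";
# prints "6 pairs without combiner, 4 with; a=2 b=3 c=1"
```

The final counts are the same either way; only the number of pairs crossing the network changes.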
  
