Institute of Formal and Applied Linguistics Wiki


courses:mapreduce-tutorial:step-10 [2012/01/25 15:46] straka (created)
courses:mapreduce-tutorial:step-10 [2012/01/25 19:01] straka

====== MapReduce Tutorial : Combiners ======

Sometimes the reduce step is a binary operation that is both associative and commutative, e.g. ''+''. In that case it is inefficient to emit all the (key, value) pairs in the mappers and send every one of them over the network.

Instead, the reducer can be executed right after the map phase, on //some portion// of the values belonging to the same key. Only the partial results are then sent over the network.
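
The saving can be illustrated without Hadoop. The following is only a minimal sketch in plain Perl, not the tutorial's API: the two input strings stand for the data of two hypothetical mappers, and ''map_words'', ''combine'' and ''reduce_all'' are names invented for this illustration. It counts how many pairs would cross the network with and without local combining and checks that the final word counts agree.

```perl
use strict;
use warnings;

# Hypothetical input: each element is the text seen by one mapper.
my @mapper_inputs = ('a b a a', 'b b a');

# Map step: emit one [word, 1] pair per word.
sub map_words {
  my ($text) = @_;
  return map { [$_, 1] } grep { length } split /\W/, $text;
}

# Local combine step: sum the values of equal keys within one mapper's output.
sub combine {
  my (@pairs) = @_;
  my %sum;
  $sum{$_->[0]} += $_->[1] for @pairs;
  return map { [$_, $sum{$_}] } sort keys %sum;
}

# Final reduce step: sum all values per key.
sub reduce_all {
  my (@pairs) = @_;
  my %total;
  $total{$_->[0]} += $_->[1] for @pairs;
  return \%total;
}

my (@plain, @combined);
for my $text (@mapper_inputs) {
  my @pairs = map_words($text);
  push @plain, @pairs;              # without a combiner every pair is sent
  push @combined, combine(@pairs);  # with a combiner only partial sums are sent
}

my $plain_result    = reduce_all(@plain);
my $combined_result = reduce_all(@combined);

printf "pairs sent without combiner: %d\n", scalar @plain;     # 7
printf "pairs sent with combiner:    %d\n", scalar @combined;  # 4
printf "a=%d b=%d\n", $combined_result->{a}, $combined_result->{b};  # a=4 b=3
```

In this toy input the number of transferred pairs drops from 7 to 4, while both variants yield the same counts.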

A Hadoop job can have such a locally executed reducer, called a //combiner//. If a combiner is specified, the output of a mapper is processed by the combiner before the pairs are sent to the reducer. The combiner may be invoked zero, one, or multiple times, usually when the data are written to disk, so the job must produce the same result in each of these cases.
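
Because the number of combiner invocations is not fixed, the final result must not depend on it. The sketch below, again plain Perl with invented helper names (''reduce_sum'', ''combine_in_chunks'') and made-up values, applies a summing combiner zero times, once, and twice to the values of a single key. The chunk size is an assumption that stands in for one spill to disk.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical values emitted by one mapper for a single key.
my @values = (1, 1, 1, 1, 1, 1);

# The reduce operation; it also serves as the combiner here.
sub reduce_sum { return sum(@_) }

# Apply the combiner to consecutive chunks of $size values,
# as if each chunk were one spill to disk.
sub combine_in_chunks {
  my ($size, @vals) = @_;
  my @out;
  while (@vals) {
    push @out, reduce_sum(splice(@vals, 0, $size));
  }
  return @out;
}

# Combiner never runs: the reducer sees the raw values.
my $zero_times = reduce_sum(@values);

# Combiner runs once over all values of the key.
my $once = reduce_sum(combine_in_chunks(scalar @values, @values));

# Combiner runs on chunks, then again on its own partial results.
my $twice = reduce_sum(combine_in_chunks(2, combine_in_chunks(2, @values)));

print "$zero_times $once $twice\n";  # 6 6 6
```

All three variants give the same sum, which is what associativity and commutativity of ''+'' guarantee.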

Typically, the combiner is the same as the reducer of the MapReduce job.
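
Reusing the reducer works for operations such as ''+'', but not for every reduce function. As a hedged illustration that goes slightly beyond the tutorial text, the plain-Perl sketch below uses an arithmetic mean: taking a mean of partial means gives a different answer, whereas a combiner that emits (sum, count) partials keeps the result exact. The helper names ''mean'' and ''partial'' and the values are invented for this example.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Arithmetic mean of a non-empty list.
sub mean { return sum(@_) / scalar(@_) }

my @values = (1, 2, 3, 4, 10);

# Reducer applied directly to all values.
my $true_mean = mean(@values);  # 20 / 5 = 4

# Reusing mean as the combiner on two chunks changes the answer.
my $wrong = mean(mean(1, 2, 3), mean(4, 10));  # mean(2, 7) = 4.5

# A combiner that emits [sum, count] partials stays exact.
sub partial { return [sum(@_), scalar(@_)] }

my @partials = (partial(1, 2, 3), partial(4, 10));
my $total = sum(map { $_->[0] } @partials);
my $count = sum(map { $_->[1] } @partials);
my $right = $total / $count;  # 20 / 5 = 4

print "$true_mean $wrong $right\n";  # 4 4.5 4
```

In such a case the combiner and the reducer would be two different pieces of code, with the reducer performing the final division.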

<code perl>
package Mapper;
use Moose;
with 'Hadoop::Mapper';

# Emit (word, 1) for every word of the input value.
sub map {
  my ($self, $key, $value, $context) = @_;

  foreach my $word (split /\W/, $value) {
    next if not length $word;
    $context->write($word, 1);
  }
}

package Reducer;
use Moose;
with 'Hadoop::Reducer';

# Sum all values of one key. Because + is associative and commutative,
# the same class can also be used as the combiner.
sub reduce {
  my ($self, $key, $values, $context) = @_;

  my $sum = 0;
  while ($values->next) {
    $sum += $values->value;
  }

  $context->write($key, $sum);
}

package Main;
use Hadoop::Runner;

# The reducer class is registered twice: as combiner and as reducer.
my $runner = Hadoop::Runner->new(
  mapper => Mapper->new(),
  combiner => Reducer->new(),
  reducer => Reducer->new(),
  input_format => 'KeyValueTextInputFormat');

$runner->run();
</code>
