Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-5 [2012/01/24 22:08]
straka
courses:mapreduce-tutorial:step-5 [2012/01/24 22:15]
straka
Line 1: Line 1:
 ====== MapReduce Tutorial : Basic reducer ====== ====== MapReduce Tutorial : Basic reducer ======
 +
 +The interesting part of a MapReduce (MR) job is the reducer: once all mappers have produced their (key, value) pairs, the ''reduce'' function is called once for every unique key, together with all values belonging to that key. The ''reduce'' function can itself output (key, value) pairs, which are written to disk.
 +
 +The ''reduce'' function is similar to ''map'', but instead of a single value it receives an iterator that enumerates all values for the given key:
  
 <file perl reducer.pl> <file perl reducer.pl>
-#!/usr/bin/perl 
-  
 package Mapper; package Mapper;
 use Moose; use Moose;
 with 'Hadoop::Mapper'; with 'Hadoop::Mapper';
- +
 sub map { sub map {
   my ($self, $key, $value, $context) = @_;   my ($self, $key, $value, $context) = @_;
- +
   $context->write($key, $value);   $context->write($key, $value);
 } }
- + 
 +package Reducer; 
 +use Moose; 
 +with 'Hadoop::Reducer'; 
 + 
 +sub reduce { 
 +  my ($self, $key, $values, $context) = @_; 
 + 
 +  while ($values->next) { 
 +    $context->write($key, $values->value); 
 +  } 
 +}
 +
 package Main; package Main;
 use Hadoop::Runner; use Hadoop::Runner;
- +
 my $runner = Hadoop::Runner->new( my $runner = Hadoop::Runner->new(
   mapper => Mapper->new(),   mapper => Mapper->new(),
-  input_format => 'TextInputFormat', +  reducer => Reducer->new()); 
-  output_format => 'TextOutputFormat', +
-  output_compression => 0); +
- +
 $runner->run(); $runner->run();
 </file> </file>
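
The grouping step between the mappers and the reducer is handled by the framework and does not appear in ''reducer.pl''. The following is a minimal stand-alone sketch in plain Perl that imitates it without any Hadoop modules. The ''ValueIterator'' package and the ''run_reduce'' and ''identity_reduce'' helpers are hypothetical names introduced only for illustration; they mimic the ''next''/''value'' calls used above and are not part of the tutorial's ''Hadoop::Reducer'' API. Processing keys in sorted order is likewise an assumption of this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the iterator handed to reduce();
# it only imitates the next/value calls used in reducer.pl.
package ValueIterator;

sub new {
  my ($class, @values) = @_;
  return bless { values => [@values], pos => -1 }, $class;
}

# Advance to the next value; returns false once all values are consumed.
sub next {
  my ($self) = @_;
  $self->{pos}++;
  return $self->{pos} < @{ $self->{values} };
}

# Return the value at the current position.
sub value {
  my ($self) = @_;
  return $self->{values}[ $self->{pos} ];
}

package main;

# Group (key, value) pairs by key, then call $reduce once per unique key.
# Keys are visited in sorted order (an assumption made for this sketch).
sub run_reduce {
  my ($pairs, $reduce) = @_;
  my %grouped;
  push @{ $grouped{ $_->[0] } }, $_->[1] for @$pairs;

  my @output;
  for my $key (sort keys %grouped) {
    $reduce->($key, ValueIterator->new(@{ $grouped{$key} }), \@output);
  }
  return \@output;
}

# Same logic as the tutorial's reduce: emit every value unchanged.
# Here the output is collected in an array instead of a $context object.
sub identity_reduce {
  my ($key, $values, $output) = @_;
  while ($values->next) {
    push @$output, [$key, $values->value];
  }
}

my $result = run_reduce(
  [ ['b', 1], ['a', 2], ['b', 3] ],
  \&identity_reduce,
);
print join("\t", @$_), "\n" for @$result;
```

In this sketch ''identity_reduce'' is called twice, once for key ''a'' and once for key ''b'', and the second call sees both values of ''b'' through the iterator. A reducer that aggregates instead of copying would only need to change the body of the ''while'' loop.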
  
