--- courses:mapreduce-tutorial:step-5 [2012/01/24 22:08] straka
+++ courses:mapreduce-tutorial:step-5 [2012/01/24 22:15] straka
@@ Line 1: / Line 1: @@
 ====== MapReduce Tutorial : Basic reducer ======
+The interesting part of an MR job is the reducer -- after the mappers have produced all (key, value) pairs, the ''reduce'' function is called once for every unique key, together with all values for that key. The ''reduce'' function can output (key, value) pairs, which are written to disk.
+The ''reduce'' function is similar to ''map'', but instead of a single value it gets an iterator that enumerates all values for the key:
 <file perl reducer.pl>
-#!/usr/bin/perl
 package Mapper;
 use Moose;
 with 'Hadoop::Mapper';
 sub map {
   my ($self, $key, $value, $context) = @_;
   $context->write($key, $value);
 }
+package Reducer;
+use Moose;
+with 'Hadoop::Reducer';
+sub reduce {
+  my ($self, $key, $values, $context) = @_;
+  while ($values->next) {
+    $context->write($key, $values->value);
+  }
+}
 package Main;
 use Hadoop::Runner;
 my $runner = Hadoop::Runner->new(
   mapper => Mapper->new(),
-  input_format => 'TextInputFormat',
+  reducer => Reducer->new());
-  output_format => 'TextOutputFormat',
-  output_compression => 0);
 $runner->run();
 </file>
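
To see the calling convention without a cluster, the reduce phase can be imitated in plain Perl. The following is only a sketch under stated assumptions: ''ValueIterator'', ''ListContext'' and ''run_reduce_phase'' are hypothetical stand-ins written for this illustration and are not part of the ''Hadoop::Reducer'' or ''Hadoop::Runner'' modules. They only mimic the ''next''/''value'' and ''write'' calls used in ''reducer.pl'' above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the iterator passed to reduce():
# next() advances to the following value, value() returns the current one.
package ValueIterator;
sub new {
    my ($class, @values) = @_;
    return bless { values => [@values], pos => -1 }, $class;
}
sub next {
    my ($self) = @_;
    $self->{pos}++;
    return $self->{pos} < @{ $self->{values} };
}
sub value {
    my ($self) = @_;
    return $self->{values}[ $self->{pos} ];
}

# Hypothetical stand-in for the context: collects written pairs in memory
# instead of writing them to disk.
package ListContext;
sub new {
    my ($class) = @_;
    return bless { output => [] }, $class;
}
sub write {
    my ($self, $key, $value) = @_;
    push @{ $self->{output} }, [$key, $value];
}

package main;

# Same body as the tutorial's reduce: copy every value to the output.
sub reduce {
    my ($key, $values, $context) = @_;
    while ($values->next) {
        $context->write($key, $values->value);
    }
}

# Group the mapper output by key, then call reduce once per unique key.
# Keys are sorted here so that the output order is deterministic.
sub run_reduce_phase {
    my ($pairs, $context) = @_;
    my %grouped;
    push @{ $grouped{ $_->[0] } }, $_->[1] for @$pairs;
    for my $key (sort keys %grouped) {
        reduce($key, ValueIterator->new(@{ $grouped{$key} }), $context);
    }
    return $context->{output};
}

my @mapper_output = (['b', 1], ['a', 2], ['b', 3]);
my $output = run_reduce_phase(\@mapper_output, ListContext->new);
print join("\t", @$_), "\n" for @$output;
```

With the three sample pairs, ''reduce'' is called twice: once for key ''a'' with the value 2, and once for key ''b'' with the values 1 and 3. The iterator must be advanced with ''next'' before the first ''value'' call, which is why the loop in ''reduce'' tests ''$values->next'' first.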

Institute of Formal and Applied Linguistics Wiki