Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-3 [2012/01/24 19:03]
straka created
courses:mapreduce-tutorial:step-3 [2012/01/28 12:03]
majlis
Line 1: Line 1:
-====== MapReduce Tutorial : ======
 +====== MapReduce Tutorial : Basic mapper ====== 
 + 
 +The simplest Hadoop job consists of a mapper only. The input data is divided into several parts, each processed by an independent mapper, and the results are collected in one directory, one file per mapper. 
 + 
 +The Hadoop framework handles failures silently. If a mapper task fails, another attempt is executed on the same input and the output of the failed attempt is discarded. 
 + 
 +===== Example Perl mapper ===== 
 + 
 +<file perl> 
 +#!/usr/bin/perl 
 + 
 +package Mapper; 
 +use Moose; 
 +with 'Hadoop::Mapper'; 
 + 
 +# Identity mapper: write every input (key, value) pair unchanged. 
 +sub map { 
 +  my ($self, $key, $value, $context) = @_; 
 + 
 +  $context->write($key, $value); 
 +} 
 + 
 +package Main; 
 +use Hadoop::Runner; 
 + 
 +my $runner = Hadoop::Runner->new( 
 +  mapper => Mapper->new(), 
 +  input_format => 'TextInputFormat', 
 +  output_format => 'TextOutputFormat', 
 +  output_compression => 0); 
 + 
 +$runner->run(); 
 +</file> 
 + 
 +The values ''input_format'', ''output_format'' and ''output_compression'' can be omitted here, because each of them is set to its default value. 
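 + 
 +Relying on those defaults, the ''Main'' part of the example could be shortened as in the following sketch. It assumes the ''Mapper'' package defined above and that the three omitted settings default to exactly the values listed in the example; the rest of the script stays unchanged. 
 + 
 +<file perl> 
 +package Main; 
 +use Hadoop::Runner; 
 + 
 +# Only the mapper is given; input_format, output_format and 
 +# output_compression are assumed to fall back to their defaults. 
 +my $runner = Hadoop::Runner->new(mapper => Mapper->new()); 
 + 
 +$runner->run(); 
 +</file> 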
 + 
 +The resulting script can be executed locally in a single thread using 
 +  perl script.pl run input_directory output_directory 
 +All files in input_directory are processed. The output_directory must not exist. 
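 + 
 +A mapper does not have to copy its input. As a minimal sketch, assuming the same ''Hadoop::Mapper'' role and ''Hadoop::Runner'' interface as in the example above, the following mapper keeps each key and writes the value converted to upper case. The package name ''UppercaseMapper'' is hypothetical and chosen only for illustration. 
 + 
 +<file perl> 
 +#!/usr/bin/perl 
 + 
 +package UppercaseMapper; 
 +use Moose; 
 +with 'Hadoop::Mapper'; 
 + 
 +sub map { 
 +  my ($self, $key, $value, $context) = @_; 
 + 
 +  # Keep the key, emit the value converted to upper case. 
 +  $context->write($key, uc($value)); 
 +} 
 + 
 +package Main; 
 +use Hadoop::Runner; 
 + 
 +# Input and output settings are left at their defaults. 
 +my $runner = Hadoop::Runner->new(mapper => UppercaseMapper->new()); 
 + 
 +$runner->run(); 
 +</file> 
 + 
 +It can be run locally in the same way as the identity mapper, again with an output directory that does not exist yet. 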
 + 
 + 
 +---- 
 + 
 +<html> 
 +<table style="width:100%"> 
 +<tr> 
 +<td style="text-align:left; width: 33%; "></html>[[step-2|Step 2]]: Input and output format, testing data.<html></td> 
 +<td style="text-align:center; width: 33%; "></html>[[.|Overview]]<html></td> 
 +<td style="text-align:right; width: 33%; "></html>[[step-4|Step 4]]: Counters.<html></td> 
 +</tr> 
 +</table> 
 +</html>
