Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-3 [2012/01/24 19:03] straka created |
courses:mapreduce-tutorial:step-3 [2012/01/30 00:37] straka Improving package names of Perl programs. |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== MapReduce Tutorial : ====== | + | ====== MapReduce Tutorial : Basic mapper |
+ | |||
+ | The simplest Hadoop job consists of a mapper only. The input data is divided into several parts, each processed by an independent mapper, and the results are collected in one directory, one file per mapper. | ||
+ | |||
+ | The Hadoop framework handles failures silently. If a mapper task fails, another attempt is executed and the output of the failed attempt is discarded. | ||
+ | |||
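The mapper-only model described above can be illustrated without Hadoop at all. The following is a minimal sketch in plain Perl, assuming in-memory records and a round-robin split; the helper names ''split_input'' and ''run_map_only'' and the ''part-NNNNN'' file names are hypothetical and are not part of the Hadoop Perl API used in this tutorial.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: distribute records round-robin into $n parts.
sub split_input {
    my ($records, $n) = @_;
    my @parts;
    push @parts, [] for 1 .. $n;
    push @{ $parts[$_ % $n] }, $records->[$_] for 0 .. $#$records;
    return @parts;
}

# Hypothetical helper: run $mapper over every part independently
# and write one output file per part into $out_dir.
sub run_map_only {
    my ($records, $n, $mapper, $out_dir) = @_;
    my @parts = split_input($records, $n);
    my @files;
    for my $i (0 .. $#parts) {
        my $path = File::Spec->catfile($out_dir, sprintf('part-%05d', $i));
        open my $fh, '>', $path or die "Cannot write $path: $!";
        for my $record (@{ $parts[$i] }) {
            my ($key, $value) = @$record;
            # The mapper may emit zero or more [key, value] pairs.
            for my $pair ($mapper->($key, $value)) {
                print {$fh} join("\t", @$pair), "\n";
            }
        }
        close $fh or die "Cannot close $path: $!";
        push @files, $path;
    }
    return @files;
}

my $out_dir = tempdir(CLEANUP => 1);
my @records;
push @records, [ $_, "line $_" ] for 1 .. 5;

# Example mapper: keep the key, upper-case the value.
my $mapper = sub { my ($key, $value) = @_; return [ $key, uc $value ] };

my @files = run_map_only(\@records, 2, $mapper, $out_dir);
print scalar(@files), " output files\n";
```

In this sketch each part is processed in sequence; in a real job the parts would be handled by separate mapper tasks, and a failed task would simply be rerun on the same part.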
+ | ===== Example Perl mapper ===== | ||
+ | |||
+ | <file perl> | ||
+ | package My:: | ||
+ | use Moose; | ||
+ | with ' | ||
+ | |||
+ | sub map { | ||
+ | my ($self, $key, $value, $context) = @_; | ||
+ | |||
+ | $context-> | ||
+ | } | ||
+ | |||
+ | package main; | ||
+ | use Hadoop:: | ||
+ | |||
+ | my $runner = Hadoop:: | ||
+ | mapper => My:: | ||
+ | input_format => ' | ||
+ | output_format => ' | ||
+ | output_compression => 0); | ||
+ | |||
+ | $runner-> | ||
+ | </ | ||
+ | |||
+ | The values '' | ||
+ | |||
+ | The resulting script can be executed locally in a single thread using | ||
+ | perl script.pl run input_directory output_directory | ||
+ | All files in input_directory are processed. The output_directory must not exist. | ||
+ | |||
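Because the run fails when the output directory already exists, it can help to check both directories before starting. The following is a minimal sketch of such a pre-flight check in plain Perl; the function ''check_directories'' is hypothetical and is not provided by the Hadoop Perl API, and the temporary directory and file names exist only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical pre-flight check, not part of the Hadoop Perl API.
# Dies if the input directory is missing or the output directory
# already exists; otherwise returns the sorted input file names.
sub check_directories {
    my ($input_dir, $output_dir) = @_;
    die "Input directory '$input_dir' does not exist\n" unless -d $input_dir;
    die "Output directory '$output_dir' already exists\n" if -e $output_dir;

    opendir my $dh, $input_dir or die "Cannot read $input_dir: $!";
    my @names = readdir $dh;
    closedir $dh;

    # Keep regular files only (skips '.', '..' and subdirectories).
    my @files = grep { my $path = File::Spec->catfile($input_dir, $_); -f $path } @names;
    @files = sort @files;
    return @files;
}

# Self-contained demonstration with a temporary input directory.
my $input_dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt)) {
    my $path = File::Spec->catfile($input_dir, $name);
    open my $fh, '>', $path or die "Cannot write $path: $!";
    close $fh or die "Cannot close $path: $!";
}
my $output_dir = File::Spec->catdir($input_dir, 'out');    # not created yet

my @files = check_directories($input_dir, $output_dir);
print "Would process: @files\n";
```

The exercise commands take the simpler route of removing the old output directory with ''rm -rf'' before each run.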
+ | |||
+ | ===== Exercise ===== | ||
+ | |||
+ | To check that your Hadoop environment works, try running a MR job on ''/ | ||
+ | wget --no-check-certificate ' | ||
+ | # NOW EDIT THE FILE | ||
+ | # $EDITOR step-3-exercise.pl | ||
+ | rm -rf step-3-out-ex; | ||
+ | less step-3-out-ex/ | ||
+ | |||
+ | ==== Solution ==== | ||
+ | You can also download the solution {{: | ||
+ | wget --no-check-certificate ' | ||
+ | # NOW VIEW THE FILE | ||
+ | # $EDITOR step-3-solution.pl | ||
+ | rm -rf step-3-out-sol; | ||
+ | less step-3-out-sol/ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | < | ||
+ | <table style=" | ||
+ | < | ||
+ | <td style=" | ||
+ | <td style=" | ||
+ | <td style=" | ||
+ | </ | ||
+ | </ | ||
+ | </ |