Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-3 [2012/01/25 21:33] straka |
courses:mapreduce-tutorial:step-3 [2012/01/29 15:34] straka |
||
---|---|---|---|
Line 3: | Line 3: | ||
The simplest Hadoop job consists of a mapper only. The input data is divided into several parts, each processed by an independent mapper, and the results are collected in one directory, one file per mapper. | The simplest Hadoop job consists of a mapper only. The input data is divided into several parts, each processed by an independent mapper, and the results are collected in one directory, one file per mapper. | ||
- | The Hadoop framework handles | + | The Hadoop framework |
===== Example Perl mapper ===== | ===== Example Perl mapper ===== | ||
- | <file perl mapper.pl> | + | <file perl> |
- | # | + | |
package Mapper; | package Mapper; | ||
use Moose; | use Moose; | ||
Line 34: | Line 32: | ||
The values '' | The values '' | ||
- | The resulting script can be executed locally | + | The resulting script can be executed locally |
perl script.pl run input_directory output_directory | perl script.pl run input_directory output_directory | ||
All files in input_directory are processed. The output_directory must not exist. | All files in input_directory are processed. The output_directory must not exist. | ||
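The mapper-only model described above (input split into parts, one independent mapper per part, one output file per mapper, and an output directory that must not already exist) can be illustrated with a small local simulation. This is only a sketch in Python, not the tutorial's Perl API and not Hadoop itself. The names `run_map_only`, `mapper` and `demo`, the `part-NNNNN` file naming, and the assumption that each input file forms exactly one part are hypothetical choices made for the illustration.

```python
import tempfile
from pathlib import Path


def mapper(key, value):
    """Hypothetical mapper: emit each (key, value) pair with the value upper-cased."""
    yield key, value.upper()


def run_map_only(input_dir, output_dir, mapper):
    """Simulate a mapper-only job locally (this is not the Hadoop API)."""
    out = Path(output_dir)
    # Mirror the rule from the text: the output directory must not exist.
    if out.exists():
        raise FileExistsError(f"output directory {out} already exists")
    out.mkdir(parents=True)

    # Assumption: each input file is one part, handled by its own mapper.
    parts = sorted(p for p in Path(input_dir).iterdir() if p.is_file())
    for index, part in enumerate(parts):
        # One output file per mapper; the part-NNNNN name is illustrative.
        with open(out / f"part-{index:05d}", "w", encoding="utf-8") as sink:
            for line in part.read_text(encoding="utf-8").splitlines():
                # Treat each line as "key<TAB>value".
                key, _, value = line.partition("\t")
                for out_key, out_value in mapper(key, value):
                    sink.write(f"{out_key}\t{out_value}\n")
    return len(parts)


def demo():
    """Run the simulation on two tiny input files in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input_directory"
        src.mkdir()
        (src / "a.txt").write_text("1\tfoo\n2\tbar\n", encoding="utf-8")
        (src / "b.txt").write_text("3\tbaz\n", encoding="utf-8")
        dst = Path(tmp) / "output_directory"
        count = run_map_only(src, dst, mapper)
        files = sorted(p.name for p in dst.iterdir())
        first = (dst / "part-00000").read_text(encoding="utf-8")
        return count, files, first


if __name__ == "__main__":
    print(demo())
```

Running the simulation a second time against the same output directory raises an error, which matches the requirement that output_directory must not exist. A real Hadoop job chooses its input splits itself, so the one-file-per-part mapping here is a simplification.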
+ | |||
===== Exercise ===== | ===== Exercise ===== | ||
- | To check that your Hadoop environment works, try running an MR job on ''/ | + | To check that your Hadoop environment works, try running an MR job on ''/ |
+ | wget --no-check-certificate ' | ||
+ | rm -rf step-3-out-ex; | ||
+ | less step-3-out-ex/ | ||
+ | |||
+ | ==== Solution ==== | ||
+ | You can also download the solution {{: | ||
+ | wget --no-check-certificate ' | ||
+ | rm -rf step-3-out-sol; | ||
+ | less step-3-out-sol/ | ||
+ | |||
+ | ---- | ||
- | {{.:step-3-solution.txt|Solution.pl}} | + |