Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-3 [2012/01/24 19:14] straka |
courses:mapreduce-tutorial:step-3 [2012/01/24 21:10] straka |
||
---|---|---|---|
Line 3: | Line 3: | ||
The simplest MR job consists of a mapper only. The input data is divided into several parts, each processed by an independent mapper, and the results are collected in one directory, one file per mapper. | The simplest MR job consists of a mapper only. The input data is divided into several parts, each processed by an independent mapper, and the results are collected in one directory, one file per mapper. | ||
- | ===== Example | + | ===== Example |
- | <code perl> | + | <file perl mapper.pl> |
# | # | ||
Line 28: | Line 28: | ||
$runner-> | $runner-> | ||
- | </code> | + | </file> |
The values '' | The values '' | ||
- | Resulting script can be executed using | + | Resulting script can be executed |
perl script.pl run input_directory output_directory | perl script.pl run input_directory output_directory | ||
- | |||
All files in input_directory are processed. The output_directory must not exist. | All files in input_directory are processed. The output_directory must not exist. | ||
+ | |||
+ | ===== Exercise ===== | ||
+ | |||
+ | To check that your Hadoop environment works, try running a MR job on ''/ | ||
+ | |||
+ | {{.: |
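
The page states that all files in input_directory are read and that output_directory must not exist. A minimal sketch of a pre-flight check for those two conditions follows. The helper name `check_job_dirs` is hypothetical and not part of the tutorial, and the sketch assumes both directories live on the local filesystem (it does not inspect HDFS paths).

```shell
#!/bin/sh
# Hypothetical helper (not part of the tutorial): verify the directory
# preconditions before launching a job. Assumes local filesystem paths.
check_job_dirs() {
    input_dir=$1
    output_dir=$2

    # The input must be an existing directory.
    if [ ! -d "$input_dir" ]; then
        echo "input directory '$input_dir' does not exist" >&2
        return 1
    fi

    # The output directory must not exist yet.
    if [ -e "$output_dir" ]; then
        echo "output directory '$output_dir' already exists" >&2
        return 2
    fi

    return 0
}
```

One way to use it is to chain it in front of the run command from the page: `check_job_dirs input_directory output_directory && perl script.pl run input_directory output_directory`. The job is then started only when both conditions hold.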