Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-12 [2012/01/25 21:24] straka |
courses:mapreduce-tutorial:step-12 [2012/01/25 22:15] straka |
||
---|---|---|---|
Line 3: | Line 3: | ||
Sometimes it would be useful to create output files manually in reducers -- either multiple files are needed per reducer, or a specific file format is desired. | Sometimes it would be useful to create output files manually in reducers -- either multiple files are needed per reducer, or a specific file format is desired. | ||
- | The problem is that the Hadoop framework can spawn same reducer | + | The problem is that the Hadoop framework can spawn several task attempts for the same reducer |
For these reasons Hadoop creates an output directory for every reduce attempt it makes. If the reducer finishes successfully, | For these reasons Hadoop creates an output directory for every reduce attempt it makes. If the reducer finishes successfully, | ||
Line 9: | Line 9: | ||
Both pieces of information are available in the Perl API through environment variables: | Both pieces of information are available in the Perl API through environment variables: | ||
* '' | * '' | ||
- | * '' | + | * '' |
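As a rough illustration of the idea above, here is a minimal sketch of writing a manual output file into the directory of the current reduce attempt. It is written in Python rather than the tutorial's Perl API. The variable names `EXAMPLE_ATTEMPT_OUTPUT_DIR` and `EXAMPLE_TASK_ID` are hypothetical placeholders, because the real names are not visible in this diff. That the second value identifies the task is also an assumption.

```python
import os
from pathlib import Path


def attempt_output_path(filename, env=None):
    """Return a path for `filename` inside the current attempt's output directory."""
    env = os.environ if env is None else env
    # HYPOTHETICAL variable names: the real ones are elided in the text above.
    work_dir = Path(env["EXAMPLE_ATTEMPT_OUTPUT_DIR"])
    task_id = env.get("EXAMPLE_TASK_ID", "0")
    # Suffix the name with the task id so two reducers never choose the same file name.
    return work_dir / f"{filename}-{task_id}"


def write_manual_output(filename, lines, env=None):
    """Write `lines` to a manually created file in the attempt's directory."""
    path = attempt_output_path(filename, env)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as out:
        for line in lines:
            out.write(line + "\n")
    return path
```

Because the file is created inside the attempt-specific directory, output from a failed or duplicated attempt stays separate from the output of the attempt that finally succeeds.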
+ | |||
+ | ===== Reduce-less jobs ===== | ||
+ | If an MR job runs without reducers, the output of the mappers is written to the output directory without further processing. In this case, the environment variable ''
+ | |||
+ | ===== Exercise ===== | ||
+ | Change the word counting script {{: | ||
+ | |||
+ | {{: | ||