Differences

This shows you the differences between two versions of the page.

--- courses:mapreduce-tutorial:step-24 [2012/01/27 22:16]
straka
+++ courses:mapreduce-tutorial:step-24 [2012/01/27 22:16]
straka
@@ Line 95: / Line 95: @@
   /net/projects/hadoop/bin/hadoop -r 0 MapperOnlyHadoopJob.jar /home/straka/wiki/cs-text-small outdir
-Mind the ''-r 0'' switch -- specifying ''-r 0'' disables the reducers. If the switch ''-r 0'' was not given, one default reducer ''IdentityReducer'' would be used. The ''IdentityReducer'' outputs every (key, value) pair it is given.
+Mind the ''-r 0'' switch -- specifying ''-r 0'' disables the reducer. If the switch ''-r 0'' was not given, one default reducer ''IdentityReducer'' would be used. The ''IdentityReducer'' outputs every (key, value) pair it is given.
   * When using ''-r 0'', the job runs faster, as the mappers write the output directly to disk. But there are as many output files as mappers, and the (key, value) pairs are stored in no special order.
   * When not specifying ''-r 0'' (i.e., using ''-r 1'' with ''IdentityReducer''), the job produces the same (key, value) pairs. But this time they are in one output file, sorted by the key. Of course, the job runs slower in this case.
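
The two cases above can be illustrated with a small simulation. This is only a sketch in plain Python, not Hadoop code: the function names (''run_mappers'', ''map_only_job'', ''identity_reducer_job'', ''word_mapper'') and the toy input are hypothetical, and lists stand in for output files. It assumes a single reducer that receives all pairs sorted by key, as described above.

```python
from operator import itemgetter


def run_mappers(splits, mapper):
    """Run the mapper over every input split; return one output list per mapper."""
    return [[pair for record in split for pair in mapper(record)]
            for split in splits]


def map_only_job(splits, mapper):
    # Analogue of -r 0: each mapper's output is kept as its own "file",
    # in whatever order the mapper produced the pairs.
    return run_mappers(splits, mapper)


def identity_reducer_job(splits, mapper):
    # Analogue of one identity reducer: all pairs are merged, sorted by key,
    # and emitted unchanged into a single "file".
    merged = [pair for part in run_mappers(splits, mapper) for pair in part]
    return [sorted(merged, key=itemgetter(0))]


def word_mapper(line):
    # Hypothetical mapper: emit (word, 1) for every word of the line.
    for word in line.split():
        yield (word, 1)


# Two input splits, so two mappers run.
splits = [["b a"], ["c a"]]
map_only = map_only_job(splits, word_mapper)
with_reducer = identity_reducer_job(splits, word_mapper)

print(len(map_only), "output files without a reducer:", map_only)
print(len(with_reducer), "output file with one identity reducer:", with_reducer)
```

Both runs contain the same (key, value) pairs. Only the number of output files and the ordering differ, which is the trade-off the sort step pays for in running time.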

Institute of Formal and Applied Linguistics Wiki