Differences
This shows you the differences between two versions of the page.
| courses:mapreduce-tutorial:step-7 [2012/01/29 20:51] straka | courses:mapreduce-tutorial:step-7 [2013/02/08 14:36] (current) popel Milan improved our Hadoop | ||
|---|---|---|---|
| Line 4: | Line 4: | ||
| A cluster can be created using | A cluster can be created using | ||
| - | / | + | / | 
| The syntax is the same as in '' | The syntax is the same as in '' | ||
| Line 11: | Line 11: | ||
| ===== Using a running cluster ===== | ===== Using a running cluster ===== | ||
| A running cluster is identified by its master. When running a Hadoop job using the Perl API, an existing cluster can be used by | A running cluster is identified by its master. When running a Hadoop job using the Perl API, an existing cluster can be used by | ||
| - | perl script.pl | + | perl script.pl -jt cluster_master: | 
| + | |||
| + | ===== Running Hadoop jobs from now on ===== | ||
| + | |||
| + | From now on, it is best to run MR jobs using a one-machine cluster -- create a one-machine cluster using '' | ||
| ===== Example ===== | ===== Example ===== | ||
| Line 18: | Line 22: | ||
| wget --no-check-certificate ' | wget --no-check-certificate ' | ||
| / | / | ||
| - | rm -rf step-7-out-sol; | + |  | 
| + | # $EDITOR step-7-wordcount.pl | ||
| + |  | ||
| less step-7-out-sol/ | less step-7-out-sol/ | ||
| Remarks: | Remarks: | ||
| - | * The reducers seem to start running before the mappers | + | * The reducers seem to start running before the mappers | 
| + | * during the first 33%, the mapper outputs are copied | ||
| + | * during the second 33%, the (key, value) pairs are sorted. | ||
| + | * during the last 33%, the user-defined reducer runs. | ||
| ---- | ---- | ||
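
The three phases listed in the remarks of the newer revision can be illustrated with a small shell sketch. It assumes that each phase (copy, sort, user-defined reduce) contributes exactly one third of the reported reducer percentage, as the remarks describe. The function name `reduce_progress` and its integer inputs from 0 to 100 are hypothetical and are not part of Hadoop or of the tutorial's Perl API.

```shell
# Hypothetical helper: combine the three reducer phases into one percentage.
# Each argument is the completion of one phase, as an integer from 0 to 100.
reduce_progress() {
    copy=$1    # phase 1: mapper outputs are copied to the reducer
    sort=$2    # phase 2: the (key, value) pairs are sorted
    reduce=$3  # phase 3: the user-defined reducer runs
    # Assumption: every phase weighs one third of the total (integer division).
    echo $(( (copy + sort + reduce) / 3 ))
}

reduce_progress 100 0 0     # copy finished, sort not started: prints 33
reduce_progress 100 100 50  # user-defined reducer half done: prints 83
```

Under this assumption, a reducer percentage below 33 means that only copying has taken place so far, which is consistent with reducers appearing to make progress while mappers are still running.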
