Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-7 [2012/01/25 00:51] straka |
courses:mapreduce-tutorial:step-7 [2013/02/08 14:36] (current) popel Milan improved our Hadoop |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== MapReduce Tutorial : Dynamic Hadoop cluster for several computations ====== | ====== MapReduce Tutorial : Dynamic Hadoop cluster for several computations ====== | ||
- | When multiple | + | When multiple |
A cluster can be created using | A cluster can be created using | ||
- | /home/straka/ | + | /net/projects/ |
The syntax is the same as in '' | The syntax is the same as in '' | ||
Line 10: | Line 10: | ||
===== Using a running cluster ===== | ===== Using a running cluster ===== | ||
- | A running cluster is identified by its master. When running a Perl MR job, an existing cluster can be used by | + | A running cluster is identified by its master. When running a Hadoop |
- | perl script.pl | + | perl script.pl -jt cluster_master:9001 ... |
+ | |||
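As a minimal sketch of how the `-jt` argument above is put together: the cluster is addressed by its master's hostname plus the port, so the argument can be built from the hostname alone. The helper name `jt_address` is hypothetical and not part of the tutorial; port 9001 is the one used in the command above, and `cluster_master` is a placeholder hostname.

```shell
# Hypothetical helper (not part of the tutorial): builds the value passed to -jt.
# Assumption: the jobtracker listens on port 9001, as in the command above.
jt_address() {
  master=$1
  port=${2:-9001}
  if [ -z "$master" ]; then
    echo "usage: jt_address MASTER [PORT]" >&2
    return 1
  fi
  printf '%s:%s\n' "$master" "$port"
}

# Print the resulting command line instead of running it.
echo perl script.pl -jt "$(jt_address cluster_master)"
# prints: perl script.pl -jt cluster_master:9001
```

Printing the command rather than running it makes it possible to check the address before submitting a job.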
+ | ===== Running Hadoop jobs from now on ===== | ||
+ | |||
+ | From now on, it is best to run MR jobs using a one-machine cluster -- create a one-machine cluster using '' | ||
===== Example ===== | ===== Example ===== | ||
- | Try running the same script {{: | + | Try running the same script {{: |
- | /home/straka/ | + | |
- | perl wordcount.pl -jt hostname_of_cluster_master:9001 -Dmapred.max.split.size=1000000 / | + | / |
+ | | ||
+ | # $EDITOR step-7-wordcount.pl | ||
+ | rm -rf step-7-out-sol; | ||
+ | less step-7-out-sol/
+ | Remarks: | ||
+ | * The reducers seem to start running before the mappers finish, because copying the mapper outputs can begin while mappers are still running; the user-defined reducer itself starts only after all mappers are done. In the web interface, the progress of each reducer is divided into thirds:
+ | * during the first 33%, the mapper outputs are copied to the machine where the reducer runs.
+ | * during the second 33%, the (key, value) pairs are sorted.
+ | * during the last 33%, the user-defined reducer runs.
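The three phases in the remarks above can be read off the reducer percentage shown in the web interface. The following is only an illustrative sketch: the function name `reducer_phase` is hypothetical, and the exact boundaries (0-33, 34-66, 67-100, integer percentages) are an assumption derived from the "thirds" description.

```shell
# Hypothetical helper: maps an integer reducer progress percentage to the
# phase it falls into. Boundaries at 33 and 66 are an assumption based on
# the "thirds" described above.
reducer_phase() {
  pct=$1
  if [ "$pct" -lt 0 ] || [ "$pct" -gt 100 ]; then
    echo "invalid"
  elif [ "$pct" -le 33 ]; then
    echo "copy"      # mapper outputs are copied to the reducer's machine
  elif [ "$pct" -le 66 ]; then
    echo "sort"      # (key, value) pairs are sorted
  else
    echo "reduce"    # the user-defined reducer runs
  fi
}

reducer_phase 20   # prints: copy
reducer_phase 50   # prints: sort
reducer_phase 90   # prints: reduce
```

A reducer that stays below 33% for a long time is therefore most likely still waiting for mapper outputs rather than running user code.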
+ | |||