Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-6 [2012/01/24 23:51] straka |
courses:mapreduce-tutorial:step-6 [2012/01/25 00:13] straka |
||
---|---|---|---|
Line 5: | Line 5: | ||
So far all our MR jobs were executed locally. But all of them can be executed on multiple machines. It suffices to add parameter '' | So far all our MR jobs were executed locally. But all of them can be executed on multiple machines. It suffices to add parameter '' | ||
perl script.pl run -c number_of_machines [-w sec_to_wait_after_job_completion] input_directory output_directory | perl script.pl run -c number_of_machines [-w sec_to_wait_after_job_completion] input_directory output_directory | ||
- | This command creates a cluster of the specified number of machines. Every machine can run two mappers and two reducers simultaneously. In order to be able to observe the status of the computation, | + | This command creates a cluster of the specified number of machines. Every machine can run two mappers and two reducers simultaneously. In order to be able to observe the status of the computation |
+ | |||
+ | When a distributed MR computation is executed, it submits a job to the SGE cluster, named after the Perl script. The SGE cluster creates 3 files in the current directory: | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | When the computation ends and is waiting because of the '' | ||
+ | |||
+ | ===== Web interface ===== | ||
+ | |||
+ | The cluster master provides a web interface on port 50030 (the port may change in the future). The cluster master address can be found on the first line of '' | ||
+ | |||
+ | The web interface provides a lot of useful information: | ||
+ | * running, failed and successfully completed jobs | ||
+ | * for a running job, the current progress and counters of the whole job and of each mapper and reducer | ||
+ | * for any job, the counters and outputs of all mappers and reducers | ||
+ | * for any job, all Hadoop settings | ||
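
The invocation and the web interface address described above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `build_run_command` and `web_interface_url` are hypothetical, the host `master.example.org` is a placeholder for the address read from the job's output file, and the `http://` scheme is an assumption. The argument order follows the command line shown earlier, and the port default is the 50030 mentioned in the text.

```python
import shlex


def build_run_command(script, machines, input_dir, output_dir, wait_sec=None):
    """Assemble: perl script.pl run -c N [-w sec] input_directory output_directory."""
    if machines < 1:
        raise ValueError("machines must be at least 1")
    argv = ["perl", script, "run", "-c", str(machines)]
    # -w is optional: seconds to keep the cluster alive after the job completes.
    if wait_sec is not None:
        argv += ["-w", str(wait_sec)]
    argv += [input_dir, output_dir]
    return argv


def web_interface_url(master_host, port=50030):
    """Build the web interface URL for the cluster master (http is assumed)."""
    return f"http://{master_host}:{port}/"


if __name__ == "__main__":
    argv = build_run_command(
        "script.pl", 4, "input_directory", "output_directory", wait_sec=600
    )
    print(shlex.join(argv))  # command line that could be passed to a shell
    print(web_interface_url("master.example.org"))  # hypothetical master host
```

Passing a non-zero `wait_sec` keeps the cluster running after the job completes, which leaves time to open the web interface and inspect the counters and outputs.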