Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-6 [2012/01/24 23:51] straka |
courses:mapreduce-tutorial:step-6 [2012/02/06 13:55] (current) straka |
||
---|---|---|---|
Line 3: | Line 3: | ||
Probably the most important feature of MapReduce is its ability to run computations in a distributed manner. | Probably the most important feature of MapReduce is its ability to run computations in a distributed manner. | ||
- | So far all our MR jobs were executed locally. But all of them can be executed on multiple machines. It suffices to add parameter '' | + | So far all our Hadoop |
- | perl script.pl | + | perl script.pl -c number_of_machines [-w sec_to_wait_after_job_completion] input_directory output_directory |
- | This command creates a cluster of the specified number of machines. Every machine can run two mappers and two reducers simultaneously. In order to be able to observe the status of the computation, | + | This command creates a cluster of the specified number of machines. Every machine can run two mappers and two reducers simultaneously. In order to be able to observe the counters, |
+ | One of the machines in the cluster is a //master//, or a //job tracker//, and it is used to identify the cluster. | ||
+ | |||
+ | In the UFAL environment, | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | When the computation ends and is waiting because of the '' | ||
+ | |||
+ | ===== Web interface ===== | ||
+ | |||
+ | The cluster master provides a web interface at the address printed by the '' | ||
+ | |||
+ | The web interface provides a lot of useful information: | ||
+ | * running, failed and successfully completed jobs | ||
+ | * for a running job, the current progress and counters of the whole job and of each mapper and reducer | ||
+ | * for any job, the counters and outputs of all mappers and reducers | ||
+ | * for any job, all Hadoop settings | ||
+ | |||
+ | |||
+ | |||
+ | ===== Example ===== | ||
+ | |||
+ | Try running the {{: | ||
+ | wget --no-check-certificate ' | ||
+ | rm -rf step-6-out; perl step-6-wordcount.pl -c 1 -w 600 -Dmapred.max.split.size=1000000 / | ||
+ | and explore the web interface. | ||
+ | |||
+ | If you cannot directly access the '' | ||
+ | ssh -N -L 50030: | ||
+ | on your computer to create a tunnel from local port 50030 to the machine '' | ||
+ | |||