Differences
This shows you the differences between two versions of the page.
|
courses:mapreduce-tutorial:step-6 [2012/01/24 23:51] straka |
courses:mapreduce-tutorial:step-6 [2012/02/06 13:55] (current) straka |
||
|---|---|---|---|
| Line 3: | Line 3: | ||
| Probably the most important feature of MapReduce is its ability to run computations in a distributed fashion. | Probably the most important feature of MapReduce is its ability to run computations in a distributed fashion. | ||
| - | So far all our MR jobs were executed locally. But all of them can be executed on multiple machines. It suffices to add parameter '' | + | So far all our Hadoop |
| - | perl script.pl | + | perl script.pl -c number_of_machines [-w sec_to_wait_after_job_completion] input_directory output_directory |
| - | This command creates a cluster of the specified number of machines. Every machine can run two mappers and two reducers simultaneously. In order to be able to observe the status of the computation, | + | This command creates a cluster of the specified number of machines. Every machine can run two mappers and two reducers simultaneously. In order to be able to observe the counters, |
| + | One of the machines in the cluster is a //master//, or a //job tracker//, and it is used to identify the cluster. | ||
| + | |||
| + | In the UFAL environment, | ||
| + | * '' | ||
| + | * '' | ||
| + | * '' | ||
| + | When the computation ends and is waiting because of the '' | ||
| + | |||
| + | ===== Web interface ===== | ||
| + | |||
| + | The cluster master provides a web interface on the address printed by the '' | ||
| + | |||
| + | The web interface provides a lot of useful information: | ||
| + | * running, failed and successfully completed jobs | ||
| + | * for a running job, the current progress and counters of the whole job and of each mapper and reducer | ||
| + | * for any job, the counters and outputs of all mappers and reducers | ||
| + | * for any job, all Hadoop settings | ||
| + | |||
| + | |||
| + | |||
| + | ===== Example ===== | ||
| + | |||
| + | Try running the {{: | ||
| + | wget --no-check-certificate ' | ||
| + | rm -rf step-6-out; perl step-6-wordcount.pl -c 1 -w 600 -Dmapred.max.split.size=1000000 / | ||
| + | and explore the web interface. | ||
| + | |||
| + | If you cannot directly access the '' | ||
| + | ssh -N -L 50030: | ||
| + | on your computer to create a tunnel from local port 50030 to machine '' | ||
| + | |||
| + | ---- | ||
| + | |||
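The capacity figures mentioned above (two mappers and two reducers per machine, and the ''-Dmapred.max.split.size'' setting used in the example) can be turned into a rough back-of-the-envelope estimate. The following is only a sketch: the helper names ''slots'', ''splits'' and ''waves'' and the 25 MB input size are hypothetical, and the split count is an approximation that ignores block boundaries, per-file effects and any slack Hadoop applies to the last split.

```shell
#!/bin/sh
# Rough capacity estimate for a cluster started with "-c N".
# Assumption taken from the text: each machine runs 2 mappers
# and 2 reducers simultaneously.

# slots MACHINES -> number of simultaneous mapper (or reducer) slots
slots() {
  echo $(( $1 * 2 ))
}

# splits INPUT_BYTES MAX_SPLIT_BYTES -> approximate number of input splits
# (ceiling division; an approximation, not Hadoop's exact algorithm)
splits() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# waves TASKS SLOTS -> rounds needed when SLOTS tasks run in parallel
waves() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Hypothetical example: 1 machine, 25 MB of input, 1 MB maximum split size.
machines=1
input_bytes=25000000
max_split=1000000

s=$(slots "$machines")
n=$(splits "$input_bytes" "$max_split")
echo "mapper slots: $s"
echo "approx. map tasks: $n"
echo "approx. map waves: $(waves "$n" "$s")"
```

Under these assumed numbers, a single machine would process about 25 map tasks in roughly 13 rounds, which is why a small split size makes many mappers visible in the web interface even on a one-machine cluster.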
