courses:mapreduce-tutorial:step-24 [2012/01/27 21:27] straka
Download the source and compile it.

The //official// way of running Hadoop jobs is to use the ''/SGE/HADOOP/active/bin/hadoop'' script. This script has no user-friendly options; jobs submitted through it can be configured using Hadoop properties only. Therefore a wrapper script is provided, with options similar to the Perl API runner:
  * ''net/projects/hadoop/bin/hadoop [-r number_of_reducers] job.jar input_path output_path'' executes the given job locally in a single thread. It is useful for debugging.
  * ''net/projects/hadoop/bin/hadoop -jt cluster_master [-r number_of_reducers] job.jar input_path output_path'' submits the job to the given ''cluster_master''.
  * ''net/projects/hadoop/bin/hadoop -c number_of_machines [-w secs_to_wait_after_job_finishes] [-r number_of_reducers] job.jar input_path output_path'' creates a new cluster with the specified number of machines, executes the given job on it, and then waits the specified number of seconds before stopping the cluster.
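The three invocation modes above can be illustrated with a short session. This is a sketch only: ''wordcount.jar'', the input/output paths, the master address ''machine1:9001'', and the cluster sizes are hypothetical examples; only the wrapper script path and option names come from the text.

```shell
# Debugging run: execute the job locally in a single thread.
net/projects/hadoop/bin/hadoop wordcount.jar /home/user/input /home/user/output-local

# Submit the job to an already running cluster whose master is machine1:9001
# (hypothetical address), requesting 4 reducers.
net/projects/hadoop/bin/hadoop -jt machine1:9001 -r 4 wordcount.jar /home/user/input /home/user/output-cluster

# Create a new 10-machine cluster, run the job on it with 4 reducers, and keep
# the cluster alive for 60 more seconds after the job finishes.
net/projects/hadoop/bin/hadoop -c 10 -w 60 -r 4 wordcount.jar /home/user/input /home/user/output-new
```

The ''-w'' grace period is presumably useful for inspecting the cluster (e.g. its web interface) after the job completes, before the machines are released.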