====== MapReduce Tutorial : Running jobs ======
The input of a Hadoop job is either a file or a directory. In the latter case, all files in the directory are processed.

The output of a Hadoop job must be a directory which does not exist yet.
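For Java jobs these rules are visible directly in the standard Hadoop API. Below is a minimal driver sketch, assuming the stock ''org.apache.hadoop.mapreduce'' API (2.x-style ''Job.getInstance''); the class name and the explicit existence check are illustrative, not part of the tutorial's own jobs:

<code java>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ExampleDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "example");

        // The input may be a single file or a directory; for a
        // directory, every file inside it becomes part of the input.
        FileInputFormat.addInputPath(job, new Path(args[0]));

        // The output directory must not exist yet -- Hadoop refuses
        // to overwrite an existing output directory. Checking first
        // gives a clearer error message (illustrative, not required).
        Path output = new Path(args[1]);
        if (FileSystem.get(conf).exists(output))
            throw new IllegalStateException("Output directory exists: " + output);
        FileOutputFormat.setOutputPath(job, output);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
</code>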
===== Running jobs =====
^ Command ^^
^ Run Perl script | ''...'' |
^ Run Java job | ''...'' |

The options are the same for Perl and Java:

^ Options ^^
^ Run locally | ''...'' |
^ Run using a specified jobtracker | ''...'' |
^ Run job in a dedicated cluster | ''...'' |
^ Run job in a dedicated cluster and after it finishes, \\ wait for //W// seconds before stopping the cluster | ''...'' |
^ Run using //R// reducers \\ (//R// > 1 does not work when running locally) | ''...'' |
^ Run using //M// mappers | ''...'' |
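The reducer count in particular corresponds to a standard Hadoop job property, which a Java driver can set directly. A minimal sketch, assuming the stock ''org.apache.hadoop.mapreduce'' API rather than the tutorial's wrapper (the class name is illustrative):

<code java>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reducer-count");

        // Ask for R = 5 reduce tasks; the local runner supports only
        // a single reducer, matching the R > 1 limitation noted above.
        job.setNumReduceTasks(5);

        // The number of map tasks is normally derived from the input
        // splits, so it cannot be set as directly as the reducer count.
    }
}
</code>

For ''Tool''-based jobs the same property can also be set on the command line with the generic Hadoop option ''-D mapred.reduce.tasks=R''.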

From February 2012, using the parameter ''...''

===== Running multiple jobs =====
There are several ways of running multiple jobs:
  * Java only: Create multiple ''Job'' objects in one driver program (see the sketch below this list).
  * Create a cluster using ''/...''.
  * Create a shell script running multiple jobs using ''...''.
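A sketch of the Java-only approach from the first bullet: two ''Job'' objects chained in one driver, the second consuming the first one's output directory. Class and path names are illustrative, and the standard Hadoop API is assumed:

<code java>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1] + "-intermediate");
        Path output = new Path(args[1]);

        // First job: reads the input, writes an intermediate directory.
        Job first = Job.getInstance(conf, "phase-1");
        FileInputFormat.addInputPath(first, input);
        FileOutputFormat.setOutputPath(first, intermediate);
        if (!first.waitForCompletion(true))
            System.exit(1);

        // Second job: all files in the intermediate directory become
        // its input; the final output directory must not exist yet.
        Job second = Job.getInstance(conf, "phase-2");
        FileInputFormat.addInputPath(second, intermediate);
        FileOutputFormat.setOutputPath(second, output);
        System.exit(second.waitForCompletion(true) ? 0 : 1);
    }
}
</code>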
