Revisions of courses:mapreduce-tutorial:running-jobs: 2012/02/05 20:00 by straka; 2013/02/08 14:33 (current) by popel ("Milan improved our Hadoop").
====== MapReduce Tutorial : Running jobs ======

The input of a Hadoop job is either a file or a directory. In the latter case, all files in the directory are processed.

The output of a Hadoop job must be a directory that does not exist yet.
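Hadoop enforces the output-directory rule itself and aborts the job if the directory already exists. The check can be sketched in plain Java without any Hadoop dependency; ''isValidOutputDir'' is a hypothetical helper for illustration, not part of the tutorial's scripts:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputDirCheck {
    // A path is usable as job output only when it does not exist yet;
    // Hadoop creates the directory itself and refuses to reuse an existing one.
    static boolean isValidOutputDir(Path out) {
        return !Files.exists(out);
    }

    public static void main(String[] args) throws IOException {
        Path fresh = Paths.get("job-output-not-created-yet"); // assumed absent
        Path taken = Files.createTempDirectory("old-output"); // exists, so rejected
        System.out.println(isValidOutputDir(fresh)); // expected: true (if absent)
        System.out.println(isValidOutputDir(taken)); // expected: false
    }
}
```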

===== Running jobs =====

^ Command ^
^ Run Perl script '' |
^ Run Java job '' |

The options are the same for Perl and Java:

^ Options ^
^ Run locally | '' |
^ Run using a specified jobtracker | '' |
^ Run the job in a dedicated cluster | '' |
^ Run the job in a dedicated cluster and, after it finishes, \\ wait for //W// seconds before stopping the cluster | '' |
^ Run using //R// reducers \\ (//R// > 1 does not work when running locally) | '' |
^ Run using //M// mappers | '' |
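When a job runs with //R// reducers, Hadoop's default ''HashPartitioner'' decides which reducer receives each key as ''(key.hashCode() & Integer.MAX_VALUE) % R''. A plain-Java sketch of that routing (the ''partitionFor'' helper is hypothetical, mirroring the formula rather than calling Hadoop):

```java
public class PartitionSketch {
    // Mirrors Hadoop's default HashPartitioner: mask off the sign bit,
    // then take the hash modulo the number of reducers.
    static int partitionFor(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int r = 3; // as if the job were started with 3 reducers
        for (String key : new String[]{"hadoop", "mapreduce", "perl", "java"}) {
            System.out.println(key + " -> reducer " + partitionFor(key, r));
        }
    }
}
```

All occurrences of the same key map to the same reducer, which is what makes per-key aggregation in the reduce phase possible.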

From February 2012, using the parameter ''

===== Running multiple jobs =====

There are several ways of running multiple jobs:
  * Java only: create multiple ''
  * Create a cluster using ''/
  * Create a shell script running multiple jobs using ''
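The Java-only approach (several job objects executed one after another, each job's output directory becoming the next job's input) can be sketched as follows. The ''Step'' interface is a hypothetical stand-in; a real driver would configure an ''org.apache.hadoop.mapreduce.Job'' per step and call ''waitForCompletion'' on it:

```java
import java.util.ArrayList;
import java.util.List;

public class JobChain {
    // Hypothetical stand-in for one MapReduce job: consumes an input path
    // and reports the output path it produced.
    interface Step {
        String run(String inputPath);
    }

    // Runs the steps sequentially, wiring each job's output directory
    // to the next job's input, and records the intermediate paths.
    static List<String> runChain(String input, List<Step> steps) {
        List<String> outputs = new ArrayList<>();
        String current = input;
        for (Step step : steps) {
            current = step.run(current);
            outputs.add(current);
        }
        return outputs;
    }

    public static void main(String[] args) {
        Step step1 = in -> in + "/step1-out";
        Step step2 = in -> in + "/step2-out";
        // prints [data/step1-out, data/step1-out/step2-out]
        System.out.println(runChain("data", List.of(step1, step2)));
    }
}
```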