Institute of Formal and Applied Linguistics Wiki


courses:mapreduce-tutorial:running-jobs (straka, 2012/02/05, created)
====== MapReduce Tutorial : Running jobs ======

The input of a Hadoop job is either a file or a directory. In the latter case, all files in the directory are processed.

The output of a Hadoop job must be a directory that does not exist yet.

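The two rules above (file-or-directory input, not-yet-existing output directory) can be checked before submitting a job. A minimal shell sketch, using placeholder paths created under ''mktemp'' for illustration:

```shell
# Guard against an existing output directory before submitting a job:
# Hadoop refuses to write into an output directory that already exists.
work=$(mktemp -d)
input="$work/input"      # placeholder; input may be a file or a directory
output="$work/output"    # placeholder; must not exist before the job runs
mkdir -p "$input"

if [ -e "$output" ]; then
    echo "refusing to run: $output already exists" >&2
    status=1
else
    echo "output directory $output is free"
    status=0
fi
```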
===== Run Perl jobs =====
Choosing mode of operation:
|  ^ Command ^
^ Run locally | ''perl script.pl input output'' |
^ Run using specified jobtracker | ''perl script.pl -jt jobtracker:port input output'' |
^ Run job in dedicated cluster | ''perl script.pl -c number_of_machines input output'' |
^ Run job in dedicated cluster and after it finishes, \\ wait for //W// seconds before stopping the cluster | ''perl script.pl -c number_of_machines -w W_seconds input output'' |

Specifying number of mappers and reducers:
|  ^ Command ^
^ Run using //R// reducers \\ (//R//>1 does not work when running locally) | ''perl script.pl -r R input output'' |
^ Run using //M// mappers | ''perl script.pl `/net/projects/hadoop/bin/compute-splitsize input M` input output'' |
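The backquoted ''compute-splitsize'' call above expands into extra command-line options before the script runs. A sketch of that shell mechanism with a stand-in function, since ''/net/projects/hadoop/bin/compute-splitsize'' and its exact output are site-specific; the emitted option name below is an assumption, not the script's documented behaviour:

```shell
# Stand-in for compute-splitsize: a real run would inspect the input size
# and emit options that yield roughly M map tasks. The option shown here
# (a minimum split size) is an assumption for illustration only.
compute_splitsize() {  # args: input M
    echo "-Dmapred.min.split.size=1048576"
}

# Command substitution splices the emitted options into the invocation,
# mirroring: perl script.pl `compute-splitsize input M` input output
cmd="perl script.pl $(compute_splitsize input 16) input output"
echo "$cmd"
```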

===== Run Java jobs =====
Choosing mode of operation:
|  ^ Command ^
^ Run locally | ''/net/projects/hadoop/bin/hadoop job.jar input output'' |
^ Run using specified jobtracker | ''/net/projects/hadoop/bin/hadoop job.jar -jt jobtracker:port input output'' |
^ Run job in dedicated cluster | ''/net/projects/hadoop/bin/hadoop job.jar -c number_of_machines input output'' |
^ Run job in dedicated cluster and after it finishes, \\ wait for //W// seconds before stopping the cluster | ''/net/projects/hadoop/bin/hadoop job.jar -c number_of_machines -w W_seconds input output'' |

Specifying number of mappers and reducers:
|  ^ Command ^
^ Run using //R// reducers \\ (//R//>1 does not work when running locally) | ''/net/projects/hadoop/bin/hadoop job.jar -r R input output'' |
^ Run using //M// mappers | ''/net/projects/hadoop/bin/hadoop job.jar `/net/projects/hadoop/bin/compute-splitsize input M` input output'' |
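For completeness, a sketch that only builds (and prints, without running) a dedicated-cluster invocation combining the options from the tables above. All values are placeholders, and combining ''-c'', ''-w'', and ''-r'' in one command line is an assumption rather than documented behaviour:

```shell
# Hypothetical combined invocation: 10 machines, 60 s shutdown delay,
# 5 reducers. The hadoop wrapper path is site-specific, so the command
# is only echoed here, never executed.
hadoop_cmd="/net/projects/hadoop/bin/hadoop job.jar -c 10 -w 60 -r 5 input output"
echo "$hadoop_cmd"
```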
