Institute of Formal and Applied Linguistics Wiki


courses:mapreduce-tutorial:running-jobs [2012/02/05 20:00]
straka
====== MapReduce Tutorial : Running jobs ======

The input of a Hadoop job is either a file or a directory. In the latter case, all files in the directory are processed.

The output of a Hadoop job must be a directory that does not exist yet; the job creates it.

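Since the job will not run with an existing output directory, it can be convenient to guard the invocation. A minimal sketch, assuming a hypothetical ''run_job'' wrapper around the ''perl script.pl'' call used throughout this page:

```shell
# Sketch: refuse to start when the output directory already exists.
# run_job is a hypothetical wrapper name; "perl script.pl" is the
# invocation used in the tables below.
run_job() {
  local input=$1 output=$2
  if [ -e "$output" ]; then
    echo "error: output '$output' already exists, remove it first" >&2
    return 1
  fi
  perl script.pl "$input" "$output"
}
```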
===== Run Perl jobs =====
Choosing the mode of operation:
| ^ Command ^
^ Run locally | ''perl script.pl input output'' |
^ Run using a specified jobtracker | ''perl script.pl -jt jobtracker:port input output'' |
^ Run the job in a dedicated cluster | ''perl script.pl -c number_of_machines input output'' |
^ Run the job in a dedicated cluster and, after it finishes, \\ wait //W// seconds before stopping the cluster | ''perl script.pl -c number_of_machines -w W_seconds input output'' |

Specifying the number of mappers and reducers:
| ^ Command ^
^ Run using //R// reducers \\ (//R// > 1 does not work when running locally) | ''perl script.pl -r R input output'' |
^ Run using //M// mappers | ''perl script.pl `/net/projects/hadoop/bin/compute-splitsize input M` input output'' |
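The number of mappers is controlled indirectly through the input split size. A sketch of the arithmetic presumably involved (the numbers are illustrative assumptions; the real ''compute-splitsize'' script's output format may differ):

```shell
# Illustrative only: derive a per-split byte size so that an input of
# total_bytes is divided into M mapper splits. Ceiling division
# guarantees the M splits cover the whole input.
total_bytes=1000000   # assumed total size of "input"
M=8                   # desired number of mappers
splitsize=$(( (total_bytes + M - 1) / M ))
echo "$splitsize"     # 125000 bytes per split
```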
