Differences

This shows you the differences between two versions of the page.

--- courses:mapreduce-tutorial:step-7 [2012/01/29 23:52]
majlis
+++ courses:mapreduce-tutorial:step-7 [2012/01/31 12:43]
straka
@@ Line 11: / Line 11: @@
 ===== Using a running cluster =====
 A running cluster is identified by its master. When running a Hadoop job using the Perl API, an existing cluster can be used by
-  perl script.pl run -jt cluster_master:9001 ...
+  perl script.pl -jt cluster_master:9001 ...
+===== Running Hadoop jobs from now on =====
+From now on, it is best to run MR jobs using a one-machine cluster -- create a one-machine cluster using ''hadoop-cluster'' for 3h (10800s) and run jobs using ''-jt cluster_master''. Running the scripts locally without any cluster has several disadvantages, most notably having only one reducer per job.
 ===== Example =====
@@ Line 20: / Line 24: @@
   # NOW VIEW THE FILE
   # $EDITOR step-7-wordcount.pl
-  rm -rf step-7-out-sol; perl step-7-wordcount.pl run -jt cluster_master:9001 -Dmapred.max.split.size=1000000 /home/straka/wiki/cs-text-medium step-7-out-sol
+  rm -rf step-7-out-sol; perl step-7-wordcount.pl -jt cluster_master:9001 -Dmapred.max.split.size=1000000 /home/straka/wiki/cs-text-medium step-7-out-sol
   less step-7-out-sol/part-*
 Remarks:

Institute of Formal and Applied Linguistics Wiki