Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-7 [2012/01/29 23:52]
majlis
courses:mapreduce-tutorial:step-7 [2012/01/31 12:43]
straka
Line 11: Line 11:
 ===== Using a running cluster =====
 Running cluster is identified by its master. When running a Hadoop job using Perl API, existing cluster can be used by
-  perl script.pl run -jt cluster_master:9001 ...
+  perl script.pl -jt cluster_master:9001 ..
 + 
 +===== Running Hadoop jobs from now on ===== 
 + 
 +From now on, it is best to run MR jobs using a one-machine cluster -- create a one-machine cluster using ''hadoop-cluster'' for 3h (10800s) and run jobs using ''-jt cluster_master''. Running the scripts locally without any cluster has several disadvantages, most notably having only one reducer per job
  
 ===== Example =====
Line 20: Line 24:
   # NOW VIEW THE FILE
   # $EDITOR step-7-wordcount.pl
-  rm -rf step-7-out-sol; perl step-7-wordcount.pl run -jt cluster_master:9001 -Dmapred.max.split.size=1000000 /home/straka/wiki/cs-text-medium step-7-out-sol
+  rm -rf step-7-out-sol; perl step-7-wordcount.pl -jt cluster_master:9001 -Dmapred.max.split.size=1000000 /home/straka/wiki/cs-text-medium step-7-out-sol
   less less step-7-out-sol/part-*
 Remarks:
