Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-7 [2012/01/25 00:51] straka
courses:mapreduce-tutorial:step-7 [2012/01/26 23:11] straka
Line 1:
 ====== MapReduce Tutorial : Dynamic Hadoop cluster for several computations ======
 
-When multiple MR jobs should be executed, it would be better to reuse the cluster instead of allocating a new one for every computation.
+When multiple Hadoop jobs should be executed, it is better to reuse the cluster instead of allocating a new one for every computation.
 
 A cluster can be created using
-  /home/straka/hadoop/bin/hadoop-cluster -c number_of_machines -w sec_to_run_the_cluster_for
+  /net/projects/hadoop/bin/hadoop-cluster -c number_of_machines -w sec_to_run_the_cluster_for
 The syntax is the same as in ''perl script.pl run''.
  
Line 10:
 
 ===== Using a running cluster =====
-Running cluster is identified by its master. When running a Perl MR job, existing cluster can be used by
-  perl script.pl run -jt hostname_of_cluster_master:9001 ...
+Running cluster is identified by its master. When running a Hadoop job using Perl API, existing cluster can be used by
+  perl script.pl run -jt cluster_master:9001 ...
 
 ===== Example =====
 
-Try running the same script {{:courses:mapreduce-tutorial:step-6.txt|wordcount.pl}} as in the last step, by creating the cluster and submitting the job to it:
-  /home/straka/hadoop/bin/hadoop-cluster -c 1 -w 7200
-  perl wordcount.pl -jt hostname_of_cluster_master:9001 -Dmapred.max.split.size=1000000 /home/straka/wiki/cs-text-medium some_output_directory
+Try running the same script {{:courses:mapreduce-tutorial:step-6.txt|wordcount.pl}} as in the last step, this time by creating the cluster and submitting the job to it:
+  /net/projects/hadoop/bin/hadoop-cluster -c 1 -w 600
+  perl wordcount.pl run -jt cluster_master:9001 -Dmapred.max.split.size=1000000 /home/straka/wiki/cs-text-medium some_output_directory
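
The newer revision's submission syntax (`perl script.pl run -jt cluster_master:9001 ...`) can be illustrated with a small dry-run helper. This is only a sketch: the function name `submit_cmd` and its argument order are hypothetical, and it prints the command rather than running it, since how the master hostname is obtained from `hadoop-cluster` is not stated in the page. Only the `run -jt host:9001` form and the example arguments come from the text.

```shell
#!/bin/sh
# Sketch only: print the job-submission command for an already running cluster.
# The "run -jt master:9001" form is taken from the tutorial text;
# submit_cmd and its argument order are hypothetical.
JT_PORT=9001

submit_cmd() {
    # $1 = Perl script, $2 = hostname of the cluster master, rest = job arguments
    script="$1"
    master="$2"
    shift 2
    echo "perl $script run -jt $master:$JT_PORT $*"
}

# Dry run with the arguments from the example; replace cluster_master
# with the hostname of the actual master.
submit_cmd wordcount.pl cluster_master \
    -Dmapred.max.split.size=1000000 \
    /home/straka/wiki/cs-text-medium some_output_directory
```

Printing the command first makes it easy to check that the `run` keyword and the master's port are in place before submitting to a cluster that only lives for the requested number of seconds.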
