Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-24 [2012/01/30 15:38] majlis
courses:mapreduce-tutorial:step-24 [2012/01/31 09:52] straka: Change Java commandline syntax.
Line 88:
 ===== Running the job =====
 The official way of running Hadoop jobs is to use the ''/SGE/HADOOP/active/bin/hadoop'' script. Jobs submitted through this script can be configured using Hadoop properties only. Therefore a wrapper script is provided, with options similar to those of the Perl API runner:
-  * ''/net/projects/hadoop/bin/hadoop [-r number_of_reducers] job.jar [generic Hadoop properties] input_path output_path'' -- executes the given job locally in a single thread. It is useful for debugging.
+  * ''/net/projects/hadoop/bin/hadoop job.jar [-Dname=value -Dname=value ...] input output_path'' -- executes the given job locally in a single thread. It is useful for debugging.
-  * ''/net/projects/hadoop/bin/hadoop -jt cluster_master [-r number_of_reducers] job.jar [generic Hadoop properties] input_path output_path'' -- submits the job to the given ''cluster_master''.
+  * ''/net/projects/hadoop/bin/hadoop job.jar -jt cluster_master [-r number_of_reducers] [-Dname=value -Dname=value ...] input output_path'' -- submits the job to the given ''cluster_master''.
-  * ''/net/projects/hadoop/bin/hadoop -c number_of_machines [-w secs_to_wait_after_job_finishes] [-r number_of_reducers] job.jar [generic Hadoop properties] input_path output_path'' -- creates a new cluster with the specified number of machines, which executes the given job and then waits for the specified number of seconds before it stops.
+  * ''/net/projects/hadoop/bin/hadoop job.jar -c number_of_machines [-w secs_to_wait_after_job_finishes] [-r number_of_reducers] [-Dname=value -Dname=value ...] input output_path'' -- creates a new cluster with the specified number of machines, which executes the given job and then waits for the specified number of seconds before it stops.
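The new argument order (jar first, then wrapper options and ''-D'' properties, then the input and output paths) can be illustrated with a small dry-run sketch. The helper ''hadoop_cmd'' is hypothetical and not part of the wrapper; it only prints the command line it would run, so it can be tried outside the cluster. The values ''10'', ''600'' and ''5'' and the path names ''input'' and ''output_path'' are placeholders chosen for illustration.

```shell
#!/bin/sh
# Path of the wrapper script, as used in the exercise of this page.
HADOOP_WRAPPER=/net/projects/hadoop/bin/hadoop

# Hypothetical helper (an assumption, not part of the wrapper):
# usage: hadoop_cmd JAR INPUT OUTPUT [wrapper options and -D properties...]
# Prints the command line in the new syntax instead of executing it.
hadoop_cmd() {
    jar=$1; input=$2; output=$3
    shift 3
    # New syntax: the jar comes first, options follow it, paths come last.
    echo "$HADOOP_WRAPPER" "$jar" "$@" "$input" "$output"
}

# Local single-threaded run with zero reducers (as in the exercise).
hadoop_cmd MapperOnlyHadoopJob.jar /home/straka/wiki/cs-text-small step-24-out-sol -r 0

# Submit to an existing cluster; cluster_master is left as a placeholder.
hadoop_cmd job.jar input output_path -jt cluster_master -r 5 -Dname=value

# Create a new cluster of 10 machines and wait 600 s after the job finishes.
hadoop_cmd job.jar input output_path -c 10 -w 600 -r 5
```

Replacing ''echo'' with a direct invocation would run the job for real; the dry run is used here only because the wrapper exists on the cluster machines alone.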
  
 ===== Exercise =====
Line 96:
   wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_export/code/courses:mapreduce-tutorial:step-24?codeblock=1' -O 'MapperOnlyHadoopJob.java'
   make -f /net/projects/hadoop/java/Makefile MapperOnlyHadoopJob.jar
-  rm -rf step-24-out-sol; /net/projects/hadoop/bin/hadoop -r 0 MapperOnlyHadoopJob.jar /home/straka/wiki/cs-text-small step-24-out-sol
+  rm -rf step-24-out-sol; /net/projects/hadoop/bin/hadoop MapperOnlyHadoopJob.jar -r 0 /home/straka/wiki/cs-text-small step-24-out-sol
   less step-24-out-sol/part-*
  
