====== Managing a Hadoop cluster ======
A Hadoop cluster can be created:
  * for a specific Hadoop job. This is done by executing the job with the ''-c'' option, see [[.:Running jobs]].
  * manually using the ''/net/projects/hadoop/bin/hadoop-cluster'' script (a concrete invocation is sketched after this list): <code>/net/projects/hadoop/bin/hadoop-cluster -c number_of_machines -w seconds_to_wait_after_all_jobs_completed</code>
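
For example, the following sketch starts a cluster manually; the machine count and wait time are illustrative placeholders, not recommended values:
<code>
# Request a cluster of 10 machines and keep it alive until
# 600 seconds after the last submitted job has completed.
/net/projects/hadoop/bin/hadoop-cluster -c 10 -w 600
</code>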
  
When a Hadoop cluster is about to start, a job is submitted to the SGE cluster. When the cluster starts successfully, the jobtracker:port and the address of the web interface are printed, and 3 files are created in the current directory:
  * ''HadoopCluster.c$SGE_JOBID'' -- high-level status of the Hadoop computation. Contains both the jobtracker:port and the address of the web interface.
  * ''HadoopCluster.o$SGE_JOBID'' -- contains the stdout and stderr of the Hadoop job.
  * ''HadoopCluster.po$SGE_JOBID'' -- contains the stdout and stderr of the Hadoop cluster.
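
Because the status file holds the jobtracker:port and the web interface address, the easiest way to recover them later is to print it (the SGE job id 123456 is a placeholder):
<code>
# Show the high-level status of the cluster started as SGE job 123456,
# including its jobtracker:port and web interface address.
cat HadoopCluster.c123456
</code>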
  
A Hadoop cluster is stopped:
  * after the timeout specified by ''-w'' elapses, counted from the moment the last task finished
  * when the ''HadoopCluster.c$SGE_JOBID'' file is deleted
  * using ''qdel'' (both manual methods are sketched after this list).
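
The two manual stop methods as shell commands; the SGE job id 123456 is again a placeholder:
<code>
# Stop the cluster by deleting its status file...
rm HadoopCluster.c123456
# ...or by deleting the SGE job itself.
qdel 123456
</code>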
  
===== Web interface =====

The web interface provides a lot of useful information:
  * running, failed and successfully completed jobs
  * for a running job, the current progress and counters of the whole job and also of each mapper and reducer
  * for any job, the counters and outputs of all mappers and reducers
  * for any job, all Hadoop settings
===== Killing running jobs =====

Jobs running on a cluster can be stopped using
<code>/SGE/HADOOP/active/bin/hadoop job -jt jobtracker:port -kill hadoop-job-id</code>

The jobs running on a cluster are listed in the web interface, or can be printed using
<code>/SGE/HADOOP/active/bin/hadoop job -jt jobtracker:port -list</code>
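
A concrete session might look as follows (the jobtracker address ''hadoop01:9001'' and the job id are placeholders; the real values are printed when the cluster starts and by ''-list''):
<code>
# List the jobs running on the cluster whose jobtracker listens at hadoop01:9001.
/SGE/HADOOP/active/bin/hadoop job -jt hadoop01:9001 -list
# Kill one of the listed jobs by its id.
/SGE/HADOOP/active/bin/hadoop job -jt hadoop01:9001 -kill job_201302081525_0001
</code>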
  
