Institute of Formal and Applied Linguistics Wiki


MapReduce Tutorial: Managing a Hadoop cluster

Hadoop clusters can be created and stopped dynamically using the SGE cluster. A Hadoop cluster consists of one jobtracker (the master of the cluster) and multiple tasktrackers. The cluster is identified by its jobtracker. The jobtracker listens on two ports: one accepts job submissions and the other serves a web interface.

A Hadoop cluster can be created:

When a Hadoop cluster is about to start, a job is submitted to the SGE cluster. When the cluster starts successfully, the jobtracker:port and the address of the web interface are printed, and 3 files are created in the current directory:

A Hadoop cluster is stopped:

Web interface

The web interface provides a lot of useful information:

Killing running jobs

Jobs running in a cluster can be stopped using

/SGE/HADOOP/active/bin/hadoop job -jt jobtracker:port -kill hadoop-job-id

The jobs running on a cluster are listed in the web interface, or can be printed using

/SGE/HADOOP/active/bin/hadoop job -jt jobtracker:port -list
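The two commands above can be combined to stop every job on a cluster at once. The following is only a minimal sketch under stated assumptions: the helper names `list_job_ids` and `kill_all_jobs` and the `HADOOP` variable are hypothetical and not part of the tutorial's tooling, the commands are invoked through the `job` subcommand of the standard Hadoop command line, and the `-list` output is assumed to put the job id (starting with `job_`) in the first column of each job line.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the tutorial's tooling.
# HADOOP may be overridden to point at a different hadoop executable.
HADOOP="${HADOOP:-/SGE/HADOOP/active/bin/hadoop}"

# Print the ids of all jobs listed by the jobtracker given as $1 (host:port).
# Assumption: each job line of the -list output starts with an id "job_...".
list_job_ids() {
    "$HADOOP" job -jt "$1" -list | awk '$1 ~ /^job_/ { print $1 }'
}

# Kill every job currently listed by the jobtracker given as $1 (host:port).
kill_all_jobs() {
    for id in $(list_job_ids "$1"); do
        "$HADOOP" job -jt "$1" -kill "$id"
    done
}
```

After sourcing the snippet, `kill_all_jobs jobtracker:port` would stop all listed jobs, so it is worth running `list_job_ids jobtracker:port` first to check what would be affected.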
