
Institute of Formal and Applied Linguistics Wiki






MapReduce Tutorial: Setting the environment

Hadoop installation

The tutorial expects you to be logged in to a computer in the UFAL cluster. In this environment, Hadoop is installed in /SGE/HADOOP/active.

You can go through the tutorial even without being connected to the UFAL cluster, but you will need

When using a local Hadoop installation, you must either run all jobs locally in a single thread, or start a local cluster and pass -jt to the jobs so that they use it (see using-a-running-cluster).

The Perl API

To use the Perl MapReduce API, you need

The Moose package

The standard Moose package is available in the UFAL environment. Just add

. /net/work/projects/perl_repo/admin/bin/setup_platform

to .profile or .bashrc, or type it in the shell. The following commands append it to ~/.bashrc:

echo -e "\n#MR Tutorial - Moose" >> ~/.bashrc
echo ". /net/work/projects/perl_repo/admin/bin/setup_platform" >> ~/.bashrc
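Running the echo commands above more than once appends duplicate lines to ~/.bashrc. A minimal sketch of a guard against that, assuming a POSIX shell with grep available; the helper name append_once is hypothetical and not part of the UFAL setup:

```shell
# Hypothetical helper (not part of the UFAL environment):
# append a line to a file only if that exact line is not already there.
append_once() {
    line=$1
    file=$2
    touch "$file"    # make sure the file exists before grepping it
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example call (commented out so that sourcing this sketch changes nothing):
# append_once ". /net/work/projects/perl_repo/admin/bin/setup_platform" ~/.bashrc
```

Because the match is on the whole line, a second call with the same text leaves the file unchanged.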

The Hadoop package

The custom Hadoop package is available in /net/projects/hadoop/perl. Just add

export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"
export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"

to .profile, .bash_profile, or .bashrc, or type it in the shell. The following commands append both lines to ~/.bashrc:

echo -e "\n#MR Tutorial - Hadoop" >> ~/.bashrc
echo 'export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"' >> ~/.bashrc
echo 'export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"' >> ~/.bashrc
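If PERL5LIB is unset or empty when the export lines above run, the resulting value starts with a colon, and sourcing .bashrc repeatedly adds the same directory again. A minimal sketch that avoids both, assuming a POSIX shell; the helper name append_path is hypothetical:

```shell
# Hypothetical helper: print a colon-separated list with a directory appended,
# unless the directory is already listed. An empty list yields just the
# directory, with no leading colon.
append_path() {
    current=$1
    dir=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;                    # already listed
        *)          printf '%s\n' "${current:+$current:}$dir" ;;   # append
    esac
}

# Example call (commented out so that sourcing this sketch changes nothing):
# PERL5LIB=$(append_path "$PERL5LIB" /net/projects/hadoop/perl); export PERL5LIB
```

The comparison is an exact string match, so /net/projects/hadoop/perl and /net/projects/hadoop/perl/ (with a trailing slash) count as different entries.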

Overview Step 2: Input and output format, testing data.

