Institute of Formal and Applied Linguistics Wiki

courses:mapreduce-tutorial:step-1 [2012/01/30 15:25] (current) — straka
====== MapReduce Tutorial: Setting the environment ======

===== Requirements =====

The tutorial expects you to be logged in to a computer in the UFAL cluster and to be able to submit jobs using SGE. In this environment, Hadoop is installed in ''/SGE/HADOOP/active''.

To use the Perl MapReduce API, you need:

  * the Perl package ''Moose''
  * the Perl package ''Hadoop''

==== The Moose package ====
The standard Moose package is available in the UFAL environment; just add
  . /net/work/projects/perl_repo/admin/bin/setup_platform
to your ''.profile'' or ''.bashrc'' (or run the following commands, which append it to ''~/.bashrc''):

  echo -e "\n#MR Tutorial - Moose" >> ~/.bashrc
  echo ". /net/work/projects/perl_repo/admin/bin/setup_platform" >> ~/.bashrc
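To confirm that the setup worked, a quick check like the following can be used (a sketch — it only reports whether ''Moose'' is visible to your ''perl''):

```shell
# Check whether the Moose package is now on Perl's module path.
# Prints "Moose OK" or "Moose missing" instead of failing hard.
if perl -MMoose -e 1 2>/dev/null; then
    echo "Moose OK"
else
    echo "Moose missing"
fi
```

Remember to open a new shell (or re-source ''~/.bashrc'') before checking, so the appended setup line takes effect.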


==== The Hadoop package ====
The custom Hadoop package is available in ''/net/projects/hadoop/perl''; just add
  export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"
  export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"
to ''.profile'', ''.bash_profile'', or ''.bashrc'' (or run the following commands, which append it to ''~/.bashrc''):

  echo -e "\n#MR Tutorial - Hadoop" >> ~/.bashrc
  echo 'export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"' >> ~/.bashrc
  echo 'export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"' >> ~/.bashrc
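The echo commands above append to ''~/.bashrc'' unconditionally, so running them twice leaves duplicate entries. A guarded variant (a sketch; the marker comment is just the one used above) appends only when the marker is not there yet:

```shell
# Append the Hadoop environment setup to ~/.bashrc only once,
# keyed on the "#MR Tutorial - Hadoop" marker comment.
BASHRC="$HOME/.bashrc"
if ! grep -q '#MR Tutorial - Hadoop' "$BASHRC" 2>/dev/null; then
    {
        printf '\n#MR Tutorial - Hadoop\n'
        printf 'export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"\n'
        printf 'export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"\n'
    } >> "$BASHRC"
fi
```

''printf'' is used instead of ''echo -e'' so the snippet behaves the same under any POSIX shell.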

===== When not logged in to the UFAL cluster =====

**If you are not logged in to the UFAL cluster, you will need:**
  * a local Hadoop installation:
    - download ''http://www.apache.org/dist/hadoop/common/hadoop-1.0.0/hadoop-1.0.0.tar.gz''
    - unpack it
    - edit the ''conf/hadoop-env.sh'' file and make sure it contains a valid line <code>export JAVA_HOME=/path/to/your/jdk</code>
  * the ''hadoop'' repository containing the Perl API and Java extensions
  * when using the Perl API, set ''hadoop_prefix'' to point to your Hadoop installation
  * when using the Java API, note that one of the ''Makefile''s contains an absolute path to the ''hadoop'' repository -- please correct it
When using a local Hadoop installation, you must run all jobs either locally in a single thread, or start a local cluster and use ''-jt'' for the jobs to use it (see [[.:step-7#using-a-running-cluster]]).
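The ''JAVA_HOME'' edit in step 3 can be scripted so it is safe to re-run. A sketch — ''HADOOP_DIR'' and the JDK path are example placeholders, not values from this tutorial; substitute your own:

```shell
# Ensure conf/hadoop-env.sh of a local Hadoop installation exports JAVA_HOME.
# HADOOP_DIR and the JDK path are example values -- adjust to your setup.
HADOOP_DIR="${HADOOP_DIR:-hadoop-1.0.0}"
JDK_PATH="${JDK_PATH:-/usr/lib/jvm/default-java}"
ENV_FILE="$HADOOP_DIR/conf/hadoop-env.sh"

# Create the file if needed so the snippet also works as a standalone demo.
mkdir -p "$HADOOP_DIR/conf"
touch "$ENV_FILE"

# Append the export only if no uncommented JAVA_HOME line is present yet.
if ! grep -q '^export JAVA_HOME=' "$ENV_FILE"; then
    echo "export JAVA_HOME=$JDK_PATH" >> "$ENV_FILE"
fi
grep '^export JAVA_HOME=' "$ENV_FILE"
```

Note that the ''hadoop-env.sh'' shipped in the tarball contains only a commented-out ''JAVA_HOME'' line, which is why the guard looks for an uncommented ''export'' at the start of a line.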

----

<html>
<table style="width:100%">
<tr>
<td style="text-align:left; width: 33%; "></html><html></td>
<td style="text-align:center; width: 33%; "></html>[[.|Overview]]<html></td>
<td style="text-align:right; width: 33%; "></html>[[step-2|Step 2]]: Input and output format, testing data.<html></td>
</tr>
</table>
</html>
