
Institute of Formal and Applied Linguistics Wiki


courses:mapreduce-tutorial:step-1 [2012/01/30 14:47] straka
====== MapReduce Tutorial : Setting the environment ======

===== Hadoop installation =====

The tutorial expects you to be logged in to a computer in the UFAL cluster. In this environment, Hadoop is installed in ''/SGE/HADOOP/active''.

===== The Perl API =====
  
To use the Perl MapReduce API, you need:

  * Perl package ''Moose''.
  * Perl package ''Hadoop''.

==== The Moose package ====
The standard Moose package is available in the UFAL environment; just add
  . /net/work/projects/perl_repo/admin/bin/setup_platform
to ''.profile'' or ''.bashrc'', or type it directly in the shell. To append it to ''.bashrc'':

  echo -e "\n#MR Tutorial - Moose" >> ~/.bashrc
  echo ". /net/work/projects/perl_repo/admin/bin/setup_platform" >> ~/.bashrc
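After sourcing the setup script, a quick sanity check (this conditional is an addition to the tutorial, not part of it) confirms that Perl can actually load Moose:

```shell
# Check whether Moose is loadable; if setup_platform was sourced
# correctly, the first branch runs.
if perl -MMoose -e 1 2>/dev/null; then
    echo "Moose is available"
else
    echo "Moose not found - source setup_platform first"
fi
```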
==== The Hadoop package ====
The custom Hadoop package is available in ''/net/projects/hadoop/perl''; just add
  export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"
  export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"
to ''.profile'', ''.bash_profile'' or ''.bashrc'', or type it directly in the shell. To append it to ''.bashrc'':

  echo -e "\n#MR Tutorial - Hadoop" >> ~/.bashrc
  echo 'export PERLLIB="$PERLLIB:/net/projects/hadoop/perl/"' >> ~/.bashrc
  echo 'export PERL5LIB="$PERL5LIB:/net/projects/hadoop/perl"' >> ~/.bashrc
===== When not logged in to the UFAL cluster =====

**If you are not logged in to the UFAL cluster, you will need:**
  * a local Hadoop installation:
    - download ''http://www.apache.org/dist/hadoop/common/hadoop-1.0.0/hadoop-1.0.0.tar.gz''
    - unpack it
    - edit the ''conf/hadoop-env.sh'' file and make sure it contains a valid line <code>export JAVA_HOME=/path/to/your/jdk</code>
  * the ''hadoop'' repository containing the Perl API and Java extensions
  * when using the Perl API, set ''hadoop_prefix'' to point to your Hadoop installation
  * when using the Java API, one of the ''Makefile''s contains an absolute path to the ''hadoop'' repository -- please correct it
When using a local Hadoop installation, you must either run all jobs locally in a single thread, or start a local cluster and pass ''-jt'' to the jobs so they use it (see [[.:step-7#using-a-running-cluster]]).
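The three installation steps above can be sketched as shell commands (a sketch only: the mirror URL is the one from the list and may have moved since, and the ''JAVA_HOME'' path below is just an example to adjust for your system):

```shell
# 1. Download the Hadoop 1.0.0 distribution (URL from the tutorial).
wget http://www.apache.org/dist/hadoop/common/hadoop-1.0.0/hadoop-1.0.0.tar.gz

# 2. Unpack it and enter the directory.
tar xzf hadoop-1.0.0.tar.gz
cd hadoop-1.0.0

# 3. Make sure conf/hadoop-env.sh points at a valid JDK
#    (example path - replace with the JDK location on your machine).
echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk' >> conf/hadoop-env.sh
```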

----

<html>
<table style="width:100%">
<tr>
<td style="text-align:left; width: 33%; "></html><html></td>
<td style="text-align:center; width: 33%; "></html>[[.|Overview]]<html></td>
<td style="text-align:right; width: 33%; "></html>[[step-2|Step 2]]: Input and output format, testing data.<html></td>
</tr>
</table>
</html>
