MapReduce Tutorial
- Part 1: Monday January 30, 14:00-17:00, lab SU2
- Part 2: Tuesday January 31, 14:00-17:00, lab SU2
Materials
Day 1
Today we will be using the Perl API. There is no need to study it beforehand; the tutorial will explain it.
Environment
- Step 1: Setting the environment.
MapReduce basics
Controlling the cluster
From now on, run all examples using a one-machine cluster. Running the scripts locally without any cluster has several disadvantages, most notably having only one reducer per job.
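To illustrate why the number of reducers matters, here is a minimal sketch in plain Python. It does not use the tutorial's Perl API or a real Hadoop cluster. The names `mapper`, `partition`, and `run_job` are hypothetical, and the CRC32-based partitioning is an assumption chosen only to make the example deterministic. It simulates a word-count job in which each key is assigned to one of `num_reducers` partitions, and each partition is reduced independently.

```python
import zlib
from collections import defaultdict

def mapper(line):
    # Emit (word, 1) for every whitespace-separated word in the line.
    for word in line.split():
        yield word, 1

def partition(key, num_reducers):
    # Hypothetical partitioner: a stable hash of the key modulo the
    # number of reducers, so the same key always reaches the same reducer.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def run_job(lines, num_reducers):
    # Shuffle phase: group values by key inside each reducer's partition.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in mapper(line):
            partitions[partition(key, num_reducers)][key].append(value)
    # Reduce phase: every partition is summed independently of the others.
    return [{key: sum(values) for key, values in part.items()}
            for part in partitions]

lines = ["a b a", "b c"]
single = run_job(lines, 1)  # one reducer: all keys land in one output
multi = run_job(lines, 3)   # three reducers: keys are spread over three outputs
```

With one reducer, all counts end up in a single output. With several reducers, the same counts are split across the outputs, and every key appears in exactly one of them.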
MapReduce extended
Setup, cleanup
Multiple reducers + Partitions
Combiners, perl inplace
Work dir
Hadoop properties
N-grams
K-means and Iterations