Institute of Formal and Applied Linguistics Wiki


This is an old revision of the document!


MapReduce Tutorial

Materials

Day 1

Today we will be using the Perl API. There is no need to study it in advance; the tutorial explains it as we go.

Environment

MapReduce basics

Controlling the cluster

From now on, run all examples using a one-machine cluster. Running the scripts locally without any cluster has several disadvantages, most notably having only one reducer per job.

MapReduce extended

Multiple reducers + Partitions
Mappers, splits
Hadoop properties
Combiners
Setup, cleanup, Perl in-place
Work dir

N-grams
K-means and Iterations

Other

