Institute of Formal and Applied Linguistics Wiki


This is an old revision of the document!


MapReduce Tutorial

Materials

Day 1

Today we will be using the Perl API (there is no need to study it now; the tutorial will explain it).

Environment

MapReduce basics

Controlling the cluster

MapReduce extended

From now on, it is best to run MR jobs on a one-machine cluster: create one using hadoop-cluster for 3h (10800s) and run jobs using -jt cluster_master. Running the scripts locally without any cluster has several disadvantages, most notably that each job gets only one reducer.

Advanced MapReduce exercises

Exercises in this section can be done in any order, but it is recommended to try solving all of them. The Perl API reference may come in handy.

Day 2

Today we will be using the Java API.

Environment

Java Hadoop basics

Advanced topics

Other

