Differences

This shows you the differences between two versions of the page.

--- courses:mapreduce-tutorial [2012/01/13 11:09]
majlis created
+++ courses:mapreduce-tutorial [2012/01/25 21:10]
straka
@@ Line 4: / Line 4: @@
 ===== Materials =====
-  * [[.:mapreduce:Introduction]]
+  * [[.:mapreduce-tutorial:Introduction]]
-===== Other ====
+===== Day 1 =====
+Today we will be using the [[.:mapreduce-tutorial:Perl API]] (there is no need to study it now; the tutorial will explain it).
+=== Environment ===
+  * [[.:mapreduce-tutorial:Step 1]]: Setting the environment.
+=== MapReduce basics ===
+  * [[.:mapreduce-tutorial:Step 2]]: Input and output format, testing data.
+  * [[.:mapreduce-tutorial:Step 3]]: Basic mapper.
+  * [[.:mapreduce-tutorial:Step 4]]: Counters.
+  * [[.:mapreduce-tutorial:Step 5]]: Basic reducer.
+=== Controlling the cluster ===
+  * [[.:mapreduce-tutorial:Step 6]]: Running on cluster.
+  * [[.:mapreduce-tutorial:Step 7]]: Dynamic Hadoop cluster for several computations.
+From now on, it is best to run MR jobs on a one-machine cluster. Running the scripts locally, without any cluster, has several disadvantages, most notably that each job is limited to a single reducer.
+=== MapReduce extended ===
+  * [[.:mapreduce-tutorial:Step 8]]: Multiple mappers, reducers and partitioning.
+  * [[.:mapreduce-tutorial:Step 9]]: Hadoop properties.
+  * [[.:mapreduce-tutorial:Step 10]]: Combiners.
+  * [[.:mapreduce-tutorial:Step 11]]: Initialization and cleanup of MR tasks, performance of combiners.
+  * [[.:mapreduce-tutorial:Step 12]]: Additional output from mappers and reducers.
+=== Advanced MapReduce exercises ===
+  * [[.:mapreduce-tutorial:Step 13]]: Sorting.
+  * [[.:mapreduce-tutorial:Step 14]]: N-gram language model.
+  * [[.:mapreduce-tutorial:Step 15]]: K-means algorithm.
+===== Other =====
   * [[user:majlis:hadoop|Further information]]

Institute of Formal and Applied Linguistics Wiki
