Differences

This shows you the differences between two versions of the page.

--- courses:mapreduce-tutorial [2012/01/27 00:44] straka
+++ courses:mapreduce-tutorial [2012/01/28 20:18] straka
@@ Line 23: / Line 23: @@
 === MapReduce extended ===
-From now on, it is best to run MR jobs using a one-machine cluster. Running the scripts locally without any cluster has several disadvantages, most notably having only one reducer per job.
+From now on, it is best to run MR jobs using a one-machine cluster -- create a one-machine cluster using ''hadoop-cluster'' for 3h (10800s) and run jobs using ''-jt cluster_master''. Running the scripts locally without any cluster has several disadvantages, most notably having only one reducer per job.
   * [[.:mapreduce-tutorial:Step 8]]: Multiple mappers, reducers and partitioning.
   * [[.:mapreduce-tutorial:Step 9]]: Hadoop properties.
@@ Line 32: / Line 32: @@
 === Advanced MapReduce exercises ===
 Exercises in this section can be done in any order, but it is recommended to try solving all of them. The [[.:mapreduce-tutorial:Perl API|Perl API reference]] may come in handy.
-  * [[.:mapreduce-tutorial:Step 13]]: Sorting
+  * [[.:mapreduce-tutorial:Step 13]]: Sorting.
-  * [[.:mapreduce-tutorial:Step 14]]: N-gram language model
+  * [[.:mapreduce-tutorial:Step 14]]: N-gram language model.
-  * [[.:mapreduce-tutorial:Step 15]]: K-means clustering
+  * [[.:mapreduce-tutorial:Step 15]]: K-means clustering.
 ===== Day 2 =====
@@ Line 41: / Line 41: @@
 === Environment ===
-  * [[.:mapreduce-tutorial:Step 21]]: Preparing the environment.
+  * [[.:mapreduce-tutorial:Step 21]]: Preparing the environment
-  * [[.:mapreduce-tutorial:Step 22]]: Optional -- Setting Eclipse.
+  * [[.:mapreduce-tutorial:Step 22]]: Optional -- Setting Eclipse
 === Java Hadoop basics ====
-  * [[.:mapreduce-tutorial:Step 23]]: Predefined formats and types.
+  * [[.:mapreduce-tutorial:Step 23]]: Predefined formats and types
-  * [[.:mapreduce-tutorial:Step 24]]: Mappers, running Java Hadoop jobs.
+  * [[.:mapreduce-tutorial:Step 24]]: Mappers, running Java Hadoop jobs
-  * [[.:mapreduce-tutorial:Step 25]]: Reducers, combiners and partitioners.
+  * [[.:mapreduce-tutorial:Step 25]]: Reducers, combiners and partitioners
-  * [[.:mapreduce-tutorial:Step 26]]: Counters, compression.
+  * [[.:mapreduce-tutorial:Step 26]]: Counters and job configuration
-  * [[.:mapreduce-tutorial:Step 27]]: Reusing Mapper and Reducer code.
-=== Exercises ===
 === Advanced topics ===
-  * Custom input format -- WholeFile and WholeFileAsPath
+  * [[.:mapreduce-tutorial:Step 27]]: Custom data types
-  * Custom data type -- Pair<A, B>
+  * [[.:mapreduce-tutorial:Step 28]]: Custom input formats
+  * [[.:mapreduce-tutorial:Step 29]]: Running multiple Hadoop jobs
 ===== Other =====
   * [[user:majlis:hadoop|Further information]]


Institute of Formal and Applied Linguistics Wiki
