Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial [2012/01/25 22:03] straka
courses:mapreduce-tutorial [2012/01/27 00:45] straka
Line 34:
   * [[.:mapreduce-tutorial:Step 13]]: Sorting
   * [[.:mapreduce-tutorial:Step 14]]: N-gram language model
 -  * [[.:mapreduce-tutorial:Step 15]]: K-means algorithm
 +  * [[.:mapreduce-tutorial:Step 15]]: K-means clustering
 + 
 +===== Day 2 ===== 
 + 
 +Today we will be using the [[http://hadoop.apache.org/common/docs/r1.0.0/api/index.html|Java API]]. 
 + 
 +=== Environment === 
 +  * [[.:mapreduce-tutorial:Step 21]]: Preparing the environment. 
 +  * [[.:mapreduce-tutorial:Step 22]]: Optional -- Setting Eclipse. 
 + 
 +=== Java Hadoop basics ===
 +  * [[.:mapreduce-tutorial:Step 23]]: Predefined formats and types. 
 +  * [[.:mapreduce-tutorial:Step 24]]: Mappers, running Java Hadoop jobs. 
 +  * [[.:mapreduce-tutorial:Step 25]]: Reducers, combiners and partitioners. 
 +  * [[.:mapreduce-tutorial:Step 26]]: Counters, compression. 
 +  * [[.:mapreduce-tutorial:Step 27]]: Reusing Mapper and Reducer code. 
 + 
 +=== Exercises === 
 +  * Are [[.:mapreduce-tutorial:Step 13]], [[.:mapreduce-tutorial:Step 14]], and [[.:mapreduce-tutorial:Step 15]] enough?
 + 
 +=== Advanced topics === 
 +  * Custom input format -- WholeFile and WholeFileAsPath 
 +  * Custom data type -- Pair<A, B>
  
 ===== Other =====
   * [[user:majlis:hadoop|Further information]]
  
