Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial [2012/01/13 11:09]
majlis created
courses:mapreduce-tutorial [2012/01/25 21:10]
straka
Line 4: Line 4:
  
 ===== Materials =====
-  * [[.:mapreduce:Introduction]]
 +  * [[.:mapreduce-tutorial:Introduction]]
  
  
-===== Other ====
 +===== Day 1 =====
 +Today we will be using the [[.:mapreduce-tutorial:Perl API]] (there is no need to study it now; the tutorial will explain it).
 +=== Environment === 
 +  * [[.:mapreduce-tutorial:Step 1]]: Setting the environment. 
 + 
 +=== MapReduce basics === 
 +  * [[.:mapreduce-tutorial:Step 2]]: Input and output format, testing data. 
 +  * [[.:mapreduce-tutorial:Step 3]]: Basic mapper. 
 +  * [[.:mapreduce-tutorial:Step 4]]: Counters. 
 +  * [[.:mapreduce-tutorial:Step 5]]: Basic reducer. 
 + 
 +=== Controlling the cluster === 
 +  * [[.:mapreduce-tutorial:Step 6]]: Running on cluster. 
 +  * [[.:mapreduce-tutorial:Step 7]]: Dynamic Hadoop cluster for several computations. 
 + 
 +From now on, it is best to run MR jobs on a one-machine cluster. Running the scripts locally without any cluster has several disadvantages, most notably that each job is limited to a single reducer.
 + 
 +=== MapReduce extended === 
 +  * [[.:mapreduce-tutorial:Step 8]]: Multiple mappers, reducers and partitioning. 
 +  * [[.:mapreduce-tutorial:Step 9]]: Hadoop properties. 
 +  * [[.:mapreduce-tutorial:Step 10]]: Combiners. 
 +  * [[.:mapreduce-tutorial:Step 11]]: Initialization and cleanup of MR tasks, performance of combiners. 
 +  * [[.:mapreduce-tutorial:Step 12]]: Additional output from mappers and reducers. 
 + 
 +=== Advanced MapReduce exercises === 
 +  * [[.:mapreduce-tutorial:Step 13]]: Sorting.
 +  * [[.:mapreduce-tutorial:Step 14]]: N-gram language model.
 +  * [[.:mapreduce-tutorial:Step 15]]: K-means algorithm.
 + 
 +===== Other =====
   * [[user:majlis:hadoop|Further information]]
  