Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial [2012/01/25 00:54] straka |
courses:mapreduce-tutorial [2012/01/28 00:31] straka |
||
---|---|---|---|
Line 21: | Line 21: | ||
* [[.: | * [[.: | ||
* [[.: | * [[.: | ||
- | |||
- | **From now on, run all examples using a one-machine cluster. Running the scripts locally without any cluster has several disadvantages, | ||
=== MapReduce extended === | === MapReduce extended === | ||
- | Setup, cleanup | + | From now on, it is best to run MR jobs using a one-machine cluster -- create a one-machine cluster using '' |
- | Multiple reducers | + | * [[.: |
- | Combiners, | + | * [[.: |
- | Work dir | + | * [[.: |
- | Hadoop | + | * [[.: |
+ | * [[.: | ||
+ | |||
+ | === Advanced MapReduce exercises === | ||
+ | Exercises in this section can be done in any order, but it is recommended to try solving all of them. The [[.: | ||
+ | * [[.: | ||
+ | * [[.: | ||
+ | * [[.: | ||
+ | |||
+ | ===== Day 2 ===== | ||
+ | |||
+ | Today we will be using the [[http:// | ||
+ | |||
+ | === Environment === | ||
+ | * [[.: | ||
+ | * [[.: | ||
+ | |||
+ | === Java Hadoop | ||
+ | * [[.: | ||
+ | * [[.: | ||
+ | * [[.: | ||
+ | * [[.: | ||
+ | |||
+ | === Custom data types and formats === | ||
+ | * Custom data type -- Pair<A, B> | ||
+ | * Custom input format -- WholeFile and WholeFileAsPath | ||
- | N-grams | + | === Exercises === |
- | K-means and Iterations | + | * Inverted index. |
+ | * Is [[.: | ||
===== Other ===== | ===== Other ===== | ||
* [[user: | * [[user: | ||