Differences
courses:mapreduce-tutorial:step-13 [2012/01/25 23:00] straka |
courses:mapreduce-tutorial:step-13 [2012/01/28 23:10] majlis Added links to previous and next chapter. |
||
---|---|---|---|
Line 3: | Line 3: | ||
You are given data consisting of (31-bit integer, string data) pairs. These are available in plain text format: | You are given data consisting of (31-bit integer, string data) pairs. These are available in plain text format: | ||
^ Path ^ Size ^ | ^ Path ^ Size ^ | ||
- | | /home/straka/hadoop/example-inputs/ | + | | /net/projects/hadoop/examples/inputs/ |
- | | /home/straka/hadoop/example-inputs/ | + | | /net/projects/hadoop/examples/inputs/ |
- | | /home/straka/hadoop/example-inputs/ | + | | /net/projects/hadoop/examples/inputs/ |
You can assume that the integers are uniformly distributed. | You can assume that the integers are uniformly distributed. | ||
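A small sketch of what the uniformity assumption allows: if keys are uniformly distributed 31-bit integers, equal-width key ranges should receive roughly equal numbers of pairs, so a range can be computed directly from the key. The function name `uniform_partition` and the parameter `r` (number of ranges) are illustrative assumptions, not part of the tutorial.

```python
KEY_BITS = 31  # keys are 31-bit integers, i.e. 0 <= key < 2**31


def uniform_partition(key, r):
    """Map a key to one of r equal-width ranges, numbered 0 .. r-1.

    Hypothetical helper: assumes keys are uniform over [0, 2**31).
    """
    if not 0 <= key < (1 << KEY_BITS):
        raise ValueError("key must be a non-negative 31-bit integer")
    # Scale the key from [0, 2**31) to [0, r) using integer arithmetic only.
    return (key * r) >> KEY_BITS


if __name__ == "__main__":
    for key in (0, 2**29, 2**30, 2**31 - 1):
        print(key, "->", uniform_partition(key, 4))
```

Because the mapping is monotone in the key, every key in range i is smaller than every key in range i+1.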
Line 15: | Line 15: | ||
^ Path ^ Size ^ | ^ Path ^ Size ^ | ||
- | | /home/straka/hadoop/example-inputs/ | + | | /net/projects/hadoop/examples/inputs/ |
- | | /home/straka/hadoop/example-inputs/ | + | | /net/projects/hadoop/examples/inputs/ |
- | | /home/straka/hadoop/example-inputs/ | + | | /net/projects/hadoop/examples/inputs/ |
Assume we want to produce //r// output files. One solution is to run two Hadoop jobs: | Assume we want to produce //r// output files. One solution is to run two Hadoop jobs: | ||
Line 23: | Line 23: | ||
- Find the best //r-1// integer separators using the sampled data. | - Find the best //r-1// integer separators using the sampled data. | ||
- Run the second pass, using the separators to guide the partitioning. | - Run the second pass, using the separators to guide the partitioning. | ||
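The separator steps above can be sketched outside Hadoop as follows. This is a minimal illustration, assuming the sampled data is available as an in-memory list of integer keys and that "best" means evenly spaced quantiles of the sample; the names `choose_separators` and `partition` are hypothetical and do not come from the tutorial or the Hadoop API.

```python
import bisect


def choose_separators(sample, r):
    """Pick r-1 separators at evenly spaced quantiles of the sorted sample.

    Assumption: evenly spaced quantiles are a reasonable reading of "best".
    """
    if r < 1 or not sample:
        raise ValueError("need r >= 1 and a non-empty sample")
    ordered = sorted(sample)
    return [ordered[(i * len(ordered)) // r] for i in range(1, r)]


def partition(key, separators):
    """Return the output file index (0 .. r-1) for a key.

    Keys below the first separator go to file 0; keys greater than or
    equal to the last separator go to file r-1.
    """
    return bisect.bisect_right(separators, key)


if __name__ == "__main__":
    sample = list(range(100))            # stand-in for the sampled keys
    separators = choose_separators(sample, 4)
    print(separators)
    print([partition(k, separators) for k in (0, 24, 25, 60, 99)])
```

In an actual second pass, a function like `partition` would play the role of the job's partitioner, so that each of the //r// output files covers one contiguous key range.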
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | < | ||
+ | <table style=" | ||
+ | <tr> | ||
+ | <td style=" | ||
+ | <td style=" | ||
+ | <td style=" | ||
+ | </tr> | ||
+ | </ | ||
+ | </ | ||