courses:mapreduce-tutorial:step-15 [2012/01/26 23:19] straka
courses:mapreduce-tutorial:step-15 [2012/01/29 16:40] (current) straka
| ''/net/projects/hadoop/examples/inputs/points-large'' | 500000 | 200 | 200 |
When dealing with iterative algorithms, each iteration is usually implemented as one Hadoop job. The Hadoop ''input_path'' should contain the input data, and each mapper should also read the current clusters. The reducers aggregate the data and output the new cluster centers. A controlling script should take care of executing the Hadoop jobs and stopping the iteration when the algorithm converges.
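
The iteration scheme described above can be sketched in a few lines of Python. This is only an in-process simulation under stated assumptions, not the tutorial's Hadoop code: no Hadoop job is launched, and the names ''map_phase'', ''reduce_phase'', ''kmeans'', ''tolerance'' and ''max_iterations'' are hypothetical. In a real setup, each pass through the loop would run one Hadoop job, and the current cluster centers would be read by the mappers from a file rather than passed as an argument.

```python
import math


def nearest(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )


def map_phase(points, centers):
    """Mapper stand-in: emit (cluster index, point) for every input point."""
    for point in points:
        yield nearest(point, centers), point


def reduce_phase(pairs, centers):
    """Reducer stand-in: average the points of each cluster into a new center."""
    sums = {}
    for idx, point in pairs:
        total, count = sums.get(idx, ([0.0] * len(point), 0))
        sums[idx] = ([t + p for t, p in zip(total, point)], count + 1)
    # A cluster that received no points keeps its previous center.
    return [
        [t / sums[i][1] for t in sums[i][0]] if i in sums else list(centers[i])
        for i in range(len(centers))
    ]


def kmeans(points, centers, tolerance=1e-6, max_iterations=100):
    """Controlling loop: one simulated job per iteration, stop on convergence."""
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        new_centers = reduce_phase(map_phase(points, centers), centers)
        # Largest distance any center moved during this iteration.
        shift = max(
            math.sqrt(sum((a - b) ** 2 for a, b in zip(old, new)))
            for old, new in zip(centers, new_centers)
        )
        centers = new_centers
        if shift <= tolerance:
            break
    return centers, iteration
```

The convergence test used here (no center moves by more than ''tolerance'') is one common choice; a controlling script could equally stop after a fixed number of iterations or when cluster assignments no longer change.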