Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-24 [2012/01/31 11:47] straka |
courses:mapreduce-tutorial:step-24 [2012/01/31 16:25] (current) dusek |
===== Example 2 ===== | ===== Example 2 ===== |
| |
Run a Hadoop job on /home/straka/wiki/cs-text-small, which filter the documents so that only three letter words remain. Also use counters to count the histogram of words lengths and to compute the percentage of three letter words in the documents. You can download the template {{:courses:mapreduce-tutorial:step-24.txt|ThreeLetterWords.java}} and execute it. | Run a Hadoop job on /home/straka/wiki/cs-text-small, which filters the documents so that only three-letter words remain. Also use counters to build a histogram of word lengths and to compute the percentage of three-letter words in the documents. You can download the template {{:courses:mapreduce-tutorial:step-24.txt|ThreeLetterWords.java}} and execute it. |
| |
| wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-24.txt' -O 'ThreeLetterWords.java' |
| # NOW VIEW THE FILE |
| # $EDITOR ThreeLetterWords.java |
| make -f /net/projects/hadoop/java/Makefile ThreeLetterWords.jar |
| rm -rf step-24-out-sol; /net/projects/hadoop/bin/hadoop ThreeLetterWords.jar -r 0 /home/straka/wiki/cs-text-small step-24-out-sol |
| less step-24-out-sol/part-* |
| |
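The per-document logic the exercise asks for can be sketched in plain Java, without any Hadoop classes. This is only an illustration and not the contents of the ThreeLetterWords.java template: the class name `ThreeLetterSketch`, both method names, and the whitespace tokenization are assumptions, and an ordinary `Map` stands in for the Hadoop counters that a real mapper would increment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the real template may be structured differently.
class ThreeLetterSketch {

    // Returns the document text reduced to its three-letter words and
    // records every word's length in the histogram (length -> count).
    // Assumption: words are separated by whitespace.
    static String filterDocument(String text, Map<Integer, Long> histogram) {
        List<String> kept = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty document yields one empty token
            }
            // Count code points so that accented letters count as one letter.
            int length = word.codePointCount(0, word.length());
            histogram.merge(length, 1L, Long::sum);
            if (length == 3) {
                kept.add(word);
            }
        }
        return String.join(" ", kept);
    }

    // Percentage of three-letter words among all words counted so far.
    static double threeLetterPercentage(Map<Integer, Long> histogram) {
        long total = 0;
        for (long count : histogram.values()) {
            total += count;
        }
        if (total == 0) {
            return 0.0;
        }
        return 100.0 * histogram.getOrDefault(3, 0L) / total;
    }
}
```

In an actual job the body of `filterDocument` would sit in the mapper, and each `histogram.merge` call would become a counter increment, so that the framework sums the counts across all map tasks.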
---- | ---- |