courses:mapreduce-tutorial:step-11 [2012/01/28 22:34] majlis |
courses:mapreduce-tutorial:step-11 [2012/01/28 22:53] majlis Commands for execution were added. |
===== Exercise ===== | ===== Exercise ===== |
| |
Improve the {{:courses:mapreduce-tutorial:step-5-solution1.txt|wc-without-combiner.pl}} script by manually combining the results in the Mapper -- create a hash of word occurrences, populate it during the ''map'' calls without outputting results and finally output all (key, value) pairs in the ''cleanup'' method. | Improve the {{:courses:mapreduce-tutorial:step-5-solution1.txt|step-11-wc-without-combiner.pl}} script by manually combining the results in the Mapper -- create a hash of word occurrences, populate it during the ''map'' calls without outputting results and finally output all (key, value) pairs in the ''cleanup'' method. |
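The idea described above can be sketched outside of Hadoop. The following is a minimal, framework-free illustration in Python, not the tutorial's Perl solution: the class names, the ''emit'' callback, the ''run_mapper'' driver and the whitespace tokenization are all assumptions made for this example. It only shows why accumulating counts in a hash and emitting them in ''cleanup'' reduces the number of (key, value) pairs the mapper outputs.

```python
from collections import Counter


class NaiveMapper:
    """Hypothetical mapper: emits (word, 1) for every token it sees."""

    def map(self, key, value, emit):
        for word in value.split():  # assumption: whitespace tokenization
            emit(word, 1)

    def cleanup(self, emit):
        pass  # nothing buffered, nothing to flush


class HashMapper:
    """Hypothetical mapper: counts words in a hash, emits only in cleanup."""

    def __init__(self):
        self.counts = Counter()

    def map(self, key, value, emit):
        # Populate the hash without outputting anything.
        self.counts.update(value.split())

    def cleanup(self, emit):
        # Output one (word, count) pair per distinct word.
        for word, count in self.counts.items():
            emit(word, count)


def run_mapper(mapper, lines):
    """Toy driver (assumption): call map per line, then cleanup once."""
    emitted = []

    def emit(key, value):
        emitted.append((key, value))

    for offset, line in enumerate(lines):
        mapper.map(offset, line, emit)
    mapper.cleanup(emit)
    return emitted


def reduce_counts(pairs):
    """Sum the values per key, as a word-count reducer would."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


lines = ["a b a", "b a c"]
naive = run_mapper(NaiveMapper(), lines)
combined = run_mapper(HashMapper(), lines)
print(len(naive), len(combined))
print(reduce_counts(naive) == reduce_counts(combined))
```

Both mappers lead to the same final counts, but the hash-based one emits one pair per distinct word instead of one pair per token. The trade-off is that the hash must fit in the memory of the mapper process.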
| |
| wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-5-solution1.txt' -O 'step-11-wc-without-combiner.pl' |
| rm -rf step-11-out-wout; time perl step-11-wc-without-combiner.pl run /home/straka/wiki/cs-text-medium/ step-11-out-wout |
| less step-11-out-wout/part-* |
| |
Measure the improvement. | Measure the improvement. |
| |
| ==== Solution ==== |
| You can also download the solution {{:courses:mapreduce-tutorial:step-11-solution.txt|step-11-wc-with-perl-hash.pl}} and check the correct output. |
| |
| wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-11-solution.txt' -O 'step-11-wc-with-perl-hash.pl' |
| rm -rf step-11-out-with-hash; time perl step-11-wc-with-perl-hash.pl run /home/straka/wiki/cs-text-medium/ step-11-out-with-hash |
| less step-11-out-with-hash/part-* |
| |
| |
| |
This is even more obvious with larger input data: | This is even more obvious with larger input data: |
^ Script ^ Time to complete on ''/home/straka/wiki/cs-text'' ^ | ^ Script ^ Time to complete on ''/home/straka/wiki/cs-text'' ^ Commands ^ |
| {{:courses:mapreduce-tutorial:step-5-solution1.txt|wc-without-combiner.pl}} | 5mins, 4sec | | | {{:courses:mapreduce-tutorial:step-5-solution1.txt|step-11-wc-without-combiner.pl}} | 5mins, 4sec | <html><pre>wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-5-solution1.txt' -O 'step-11-wc-without-combiner.pl'<br>rm -rf step-11-out-wout; time perl step-11-wc-without-combiner.pl run /home/straka/wiki/cs-text/ step-11-out-wout</pre></html> | |
| {{:courses:mapreduce-tutorial:step-10.txt|wc-with-combiner.pl}} | 5mins, 33sec | | | {{:courses:mapreduce-tutorial:step-10.txt|step-11-wc-with-combiner.pl}} | 5mins, 33sec | <html><pre>wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-10-solution.txt' -O 'step-11-wc-with-combiner.pl'<br>rm -rf step-11-out-with-combiner; time perl step-11-wc-with-combiner.pl run /home/straka/wiki/cs-text/ step-11-out-with-combiner</pre></html>| |
| | {{:courses:mapreduce-tutorial:step-11-solution.txt|step-11-wc-with-perl-hash.pl}} | 2mins, 24sec | <html><pre>wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-11-solution.txt' -O 'step-11-wc-with-perl-hash.pl'<br>rm -rf step-11-out-with-perl-hash; time perl step-11-wc-with-perl-hash.pl run /home/straka/wiki/cs-text/ step-11-out-with-perl-hash</pre></html>| |