courses:mapreduce-tutorial:step-11 [2012/01/28 12:32] majlis |
courses:mapreduce-tutorial:step-11 [2012/01/28 22:31] majlis |
===== Exercise ===== | ===== Exercise ===== |
| |
Improve the {{:courses:mapreduce-tutorial:TODOstep-5-solution1.txt|TODOwc-without-combiner.pl}} script by manually combining the results in the Mapper -- create a hash of word occurrences, populate it during the ''map'' calls without outputting results and finally output all (key, value) pairs in the ''cleanup'' method. | Improve the {{:courses:mapreduce-tutorial:step-5-solution1.txt|wc-without-combiner.pl}} script by manually combining the results in the Mapper -- create a hash of word occurrences, populate it during the ''map'' calls without outputting results and finally output all (key, value) pairs in the ''cleanup'' method. |
| |
Measure the improvement. | Measure the improvement. |
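The idea behind the exercise (often called in-mapper combining) can be sketched independently of the tutorial's Perl API. The following is a minimal illustration in Python, not the linked solution: the class ''CombiningMapper'', the helper ''run_mapper'' and the ''emit'' callback are hypothetical names invented for this sketch, and only the ''map''/''cleanup'' split mirrors what the exercise describes.

```python
from collections import defaultdict


class CombiningMapper:
    """Hypothetical mapper with map/cleanup hooks (not the tutorial's Perl API)."""

    def __init__(self):
        # Hash of word occurrences, kept for the lifetime of the mapper.
        self.counts = defaultdict(int)

    def map(self, key, value, emit):
        # Only update the in-memory hash; nothing is emitted here.
        for word in value.split():
            self.counts[word] += 1

    def cleanup(self, emit):
        # Emit one (word, count) pair per distinct word, once all input is seen.
        for word, count in self.counts.items():
            emit(word, count)


def run_mapper(lines):
    """Drive the mapper over the input lines and collect everything it emits."""
    mapper = CombiningMapper()
    output = []

    def emit(key, value):
        output.append((key, value))

    for line_number, line in enumerate(lines):
        mapper.map(line_number, line, emit)
    mapper.cleanup(emit)
    return output


if __name__ == "__main__":
    for word, count in sorted(run_mapper(["a b a", "b c"])):
        print(word, count)
```

For the two input lines above, the mapper emits three pairs instead of the five a plain word-count mapper would produce, at the cost of holding the hash in memory until ''cleanup'' runs.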
| |
{{:courses:mapreduce-tutorial:step-11-solution.txt|Solution.pl}} | |
| |
===== Combiners and Perl API performance ===== | ===== Combiners and Perl API performance ===== |
This is even more obvious with larger input data: | This is even more obvious with larger input data: |
^ Script ^ Time to complete on ''/home/straka/wiki/cs-text'' ^ | ^ Script ^ Time to complete on ''/home/straka/wiki/cs-text'' ^ |
| {{:courses:mapreduce-tutorial:TODOstep-5-solution1.txt|TODOwc-without-combiner.pl}} | 5mins, 4sec | | | {{:courses:mapreduce-tutorial:step-5-solution1.txt|wc-without-combiner.pl}} | 5mins, 4sec | |
| {{:courses:mapreduce-tutorial:step-10.txt|wc-with-combiner.pl}} | 5mins, 33sec | | | {{:courses:mapreduce-tutorial:step-10.txt|wc-with-combiner.pl}} | 5mins, 33sec | |
| {{:courses:mapreduce-tutorial:step-11-solution.txt|wc-with-perl-hash.pl}} | 2mins, 24sec | | | {{:courses:mapreduce-tutorial:step-11-solution.txt|wc-with-perl-hash.pl}} | 2mins, 24sec | |
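To put the table in relative terms, the times can be converted to seconds and compared against the script without a combiner. This is only a small arithmetic sketch in Python; it assumes each row is a single measured run, so the ratios are indicative rather than precise. The helper ''to_seconds'' is a name made up for this example.

```python
def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair from the table to seconds."""
    return 60 * minutes + seconds


# Times copied from the table above.
times = {
    "wc-without-combiner.pl": to_seconds(5, 4),
    "wc-with-combiner.pl": to_seconds(5, 33),
    "wc-with-perl-hash.pl": to_seconds(2, 24),
}

baseline = times["wc-without-combiner.pl"]

# Speedup relative to the baseline; values below 1 mean the script was slower.
speedups = {name: baseline / t for name, t in times.items()}

if __name__ == "__main__":
    for name, factor in speedups.items():
        print(f"{name}: {times[name]} s, {factor:.2f}x")
```

On these numbers the Perl-hash version is roughly twice as fast as the baseline, while the version with a separate combiner is somewhat slower than using no combiner at all.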
| |
| |
For comparison, here are the times of the Java solutions: | For comparison, here are the times of the Java solutions: |