Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-8 [2012/01/29 21:12] straka
courses:mapreduce-tutorial:step-8 [2012/01/31 15:55] (current) straka
Line 12:
 
 To use multiple reducers, the MR job must be executed by a cluster (even one consisting of a single computer), not locally. The number of reducers is specified by the ''-r'' flag:
-  perl script.pl run [-jt cluster_master | -c cluster_size [-w sec_to_wait]] [-r number_of_reducers]
+  perl script.pl [-jt cluster_master | -c cluster_size [-w sec_to_wait]] [-r number_of_reducers]
 
 The optimal number of reducers is the same as the number of machines in the cluster, so that all the reducers can run in parallel at the same time.
Line 28:
 
 <code perl>
-package Partitioner;
+package My::Partitioner;
 use Moose;
 with 'Hadoop::Partitioner';
Line 39:
 
 ...
-package Main;
+package main;
 use Hadoop::Runner;
 
 my $runner = Hadoop::Runner->new(
   ...
-  partitioner => Partitioner->new(),
+  partitioner => My::Partitioner->new(),
   ...);
 ...
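
The partitioner passed to the runner decides which reducer receives each key. As a rough illustration of that idea only, the following standalone sketch assigns keys to reducers with a simple string hash. The function name ''reducer_for'' and the hash itself are assumptions made for this example; the sketch does not implement the ''Hadoop::Partitioner'' role, whose required methods are not shown in this diff.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Hadoop::Partitioner: map a key to a
# reducer index in 0 .. $reducers-1 using a simple polynomial string hash.
# Assumes keys are plain byte strings.
sub reducer_for {
    my ($key, $reducers) = @_;
    my $hash = 0;
    for my $byte (unpack 'C*', $key) {
        # The modulus keeps the running value small.
        $hash = ($hash * 31 + $byte) % 2147483647;
    }
    return $hash % $reducers;
}

my $reducers = 2;   # same count as the -r 2 used in the exercise commands
for my $key (qw(apple banana cherry)) {
    printf "%s -> reducer %d\n", $key, reducer_for($key, $reducers);
}
```

Because the index depends only on the key, every occurrence of a key lands on the same reducer, which is the property any partitioner has to preserve.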
Line 52:
 
 ===== The order of keys during reduce =====
-It is guaranteed that every reducer processes the keys in //ascending order//.
+It is guaranteed that every reducer processes the keys in //ascending lexicographic order//.
 
 On the other hand, the order of values belonging to one key is undefined.
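
The word "lexicographic" matters when keys look like numbers. The following plain Perl sketch, independent of the MapReduce framework, contrasts string order with numeric order and shows zero-padding as one possible way to make the two agree. The sample keys and the padding width of 5 are arbitrary choices for this example.

```perl
use strict;
use warnings;

my @keys = (10, 9, 100, 2);

# The default sort compares keys as strings, i.e. lexicographically.
my @lexicographic = sort @keys;

# Numeric comparison, shown only for contrast.
my @numeric = sort { $a <=> $b } @keys;

# Zero-padding to a fixed width makes string order agree with numeric order.
my @padded_keys = map { sprintf '%05d', $_ } @keys;
my @padded      = sort @padded_keys;

print "lexicographic: @lexicographic\n";   # 10 100 2 9
print "numeric:       @numeric\n";         # 2 9 10 100
print "padded:        @padded\n";          # 00002 00009 00010 00100
```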
Line 60:
 Run one MR job on '/home/straka/wiki/cs-text-medium', which creates two output files -- one with an ascending list of unique article names and the other with an ascending list of unique words. You can download the template {{:courses:mapreduce-tutorial:step-8-exercise.txt|step-8-exercise.pl}} and execute it.
   wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-8-exercise.txt' -O 'step-8-exercise.pl'
-  rm -rf step-8-out-ex; perl step-8-exercise.pl run /home/straka/wiki/cs-text-medium/ step-8-out-ex
+  # NOW EDIT THE FILE
+  # $EDITOR step-8-exercise.pl
+  rm -rf step-8-out-ex; perl step-8-exercise.pl -c 2 -r 2 /home/straka/wiki/cs-text-medium/ step-8-out-ex
   less step-8-out-ex/part-*
 
Line 66: Line 68:
 You can also download the solution {{:courses:mapreduce-tutorial:step-8-solution.txt|step-8-solution.pl}} and check the correct output.
   wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-8-solution.txt' -O 'step-8-solution.pl'
-  rm -rf step-8-out-sol; perl step-8-solution.pl run /home/straka/wiki/cs-text-medium/ step-8-out-sol
+  # NOW VIEW THE FILE
+  # $EDITOR step-8-solution.pl
+  rm -rf step-8-out-sol; perl step-8-solution.pl -c 2 -r 2 /home/straka/wiki/cs-text-medium/ step-8-out-sol
   less step-8-out-sol/part-*
 
