Institute of Formal and Applied Linguistics Wiki


courses:mapreduce-tutorial:step-24 [2012/01/31 12:16] straka
===== Counters =====
  
As in the Perl API, a mapper (or a reducer) can increment various counters by using ''context.getCounter("Group", "Name").increment(value)'':
<code java>
public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
  ...
  context.getCounter("Group", "Name").increment(1);
  ...
}
</code>
The ''getCounter'' method returns a [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/Counter.html|Counter]] object, so if a counter is incremented frequently, the ''getCounter'' method can be called only once:
<code java>
public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
  ...
  Counter words = context.getCounter("Mapper", "Number of words");
  for (String word : value.toString().split("\\W+")) {
    ...
    words.increment(1);
  }
}
</code>
  
===== Example 2 =====
Run a Hadoop job on /home/straka/wiki/cs-text-small which filters the documents so that only three-letter words remain. Also use counters to compute the histogram of word lengths and the percentage of three-letter words in the documents. You can download the template {{:courses:mapreduce-tutorial:step-24.txt|ThreeLetterWords.java}} and execute it.

  wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-24.txt' -O 'ThreeLetterWords.java'
  # NOW VIEW THE FILE
  # $EDITOR ThreeLetterWords.java
  make -f /net/projects/hadoop/java/Makefile ThreeLetterWords.jar
  rm -rf step-24-out-sol; /net/projects/hadoop/bin/hadoop ThreeLetterWords.jar -r 0 /home/straka/wiki/cs-text-small step-24-out-sol
  less step-24-out-sol/part-*
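The counting that the job's counters perform can be sketched in plain Java, without any Hadoop dependencies. This is only an illustration of the logic, not the template's actual implementation; the class name, sample text, and method names here are made up, and words are split on ''\\W+'' as in the mapper example above:

<code java>
import java.util.Map;
import java.util.TreeMap;

public class WordLengthHistogram {
    // Tally word lengths into a histogram, splitting the text on
    // non-word characters just like the mapper example above.
    public static Map<Integer, Integer> histogram(String text) {
        Map<Integer, Integer> hist = new TreeMap<>();
        for (String word : text.split("\\W+")) {
            if (word.isEmpty()) continue;          // skip empty tokens from leading separators
            hist.merge(word.length(), 1, Integer::sum);
        }
        return hist;
    }

    public static void main(String[] args) {
        String sample = "the cat sat on the mat and ate fish";
        Map<Integer, Integer> hist = histogram(sample);   // {2=1, 3=7, 4=1}

        // Percentage of three-letter words, as Example 2 asks for.
        int total = hist.values().stream().mapToInt(Integer::intValue).sum();
        int three = hist.getOrDefault(3, 0);
        System.out.println(hist);
        System.out.printf("three-letter words: %.1f%%%n", 100.0 * three / total);
    }
}
</code>

In the real job the histogram would live in counters (one per word length, e.g. in a "Length histogram" group), which Hadoop sums across all mapper tasks automatically; the percentage can then be computed from the final counter values after the job finishes.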
  
----
