
Institute of Formal and Applied Linguistics Wiki



courses:mapreduce-tutorial:step-24, revision 2012/01/31 16:25 (current) by dusek; previous revision 2012/01/31 11:29 by straka
===== Counters =====
  
As in the Perl API, a mapper (or a reducer) can increment various counters by using ''context.getCounter("Group", "Name").increment(value)'':
<code java>
public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
  ...
  context.getCounter("Group", "Name").increment(1);
  ...
}
</code>
The ''getCounter'' method returns a [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/Counter.html|Counter]] object, so if a counter is incremented frequently, the ''getCounter'' method can be called only once:
<code java>
public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
  ...
  Counter words = context.getCounter("Mapper", "Number of words");
  for (String word : value.toString().split("\\W+")) {
    ...
    words.increment(1);
  }
}
</code>
  
===== Example 2 =====

Run a Hadoop job on ''/home/straka/wiki/cs-text-small'' which filters the documents so that only three-letter words remain. Also use counters to collect a histogram of word lengths and to compute the percentage of three-letter words in the documents. You can download the template {{:courses:mapreduce-tutorial:step-24.txt|ThreeLetterWords.java}} and execute it.

  wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-24.txt' -O 'ThreeLetterWords.java'
  # NOW VIEW THE FILE
  # $EDITOR ThreeLetterWords.java
  make -f /net/projects/hadoop/java/Makefile ThreeLetterWords.jar
  rm -rf step-24-out-sol; /net/projects/hadoop/bin/hadoop ThreeLetterWords.jar -r 0 /home/straka/wiki/cs-text-small step-24-out-sol
  less step-24-out-sol/part-*
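
The core of the exercise can be sketched outside Hadoop. The following plain-Java fragment (a sketch with hypothetical names, not the actual ''ThreeLetterWords.java'' template) filters three-letter words from one input string and accumulates the word-length histogram that the counters would hold in the real job:

<code java>
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

public class ThreeLetterWordsSketch {
    public static void main(String[] args) {
        // A stand-in for one input document (hypothetical sample text).
        String document = "One day all web data will fit in RAM";

        TreeMap<Integer, Long> lengthHistogram = new TreeMap<>();  // plays the role of the counter group
        List<String> threeLetterWords = new ArrayList<>();
        long totalWords = 0;

        // Split on non-word characters, as in the map method above.
        for (String word : document.split("\\W+")) {
            if (word.isEmpty()) continue;
            totalWords++;
            lengthHistogram.merge(word.length(), 1L, Long::sum);   // would be a counter increment
            if (word.length() == 3) threeLetterWords.add(word);    // only three-letter words are kept
        }

        double percentage = 100.0 * threeLetterWords.size() / totalWords;
        System.out.println(threeLetterWords);
        System.out.println(lengthHistogram);
        System.out.printf(Locale.ROOT, "Three-letter words: %.1f%%%n", percentage);
    }
}
</code>

For the sample sentence this prints ''[One, day, all, web, fit, RAM]'', the histogram ''{2=1, 3=6, 4=2}'' and ''Three-letter words: 66.7%''. In the Hadoop job the histogram and totals live in counters, so the percentage is computed from the job's counter values after it finishes rather than inside the mapper.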
  
----
