
Institute of Formal and Applied Linguistics Wiki

MapReduce Tutorial : Counters

Sometimes it is useful to count events without emitting them as (key, value) pairs. For that purpose, Hadoop offers a simple counter framework.

Hadoop maintains a collection of pre-defined and user-defined counters. Every counter is identified by its group name and counter name. Both names are arbitrary strings – for example, group “Documents” and name “Number of words”. To increment a counter, the following code can be used:

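sub map {
  my ($self, $key, $value, $context) = @_;

  # $group and $counter are arbitrary strings naming the counter;
  # $increment is the amount added to its current value.
  $context->counter($group, $counter, $increment);
}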

At the end of the computation, Hadoop prints the aggregated values of all counters.
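To see how increments from many map calls add up to one value per (group, name) pair, the following minimal sketch can be run without Hadoop. It is an illustration only: MockContext and MyMapper are hypothetical names that are not part of the Hadoop Perl API, the mock merely mimics the counter($group, $counter, $increment) call shown above, and the two input records are invented sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the Hadoop context (an assumption for this sketch):
# it only sums the increments per (group, counter name) pair.
package MockContext;

sub new {
  my ($class) = @_;
  return bless { counters => {} }, $class;
}

# Same call shape as in the tutorial: counter($group, $counter, $increment).
sub counter {
  my ($self, $group, $counter, $increment) = @_;
  $self->{counters}{$group}{$counter} += $increment;
}

# Hypothetical mapper that counts articles and words instead of writing output.
package MyMapper;

sub map {
  my ($self, $key, $value, $context) = @_;

  # Count whitespace-separated words in the value.
  my @words = split ' ', $value;
  $context->counter('Documents', 'Number of words', scalar @words);
  $context->counter('Documents', 'Number of articles', 1);
}

package main;

# Invented sample input: article title => article text.
my %articles = (
  'Prague' => 'Prague is the capital',
  'Brno'   => 'Brno is a city',
);

my $context = MockContext->new();
for my $title (sort keys %articles) {
  MyMapper->map($title, $articles{$title}, $context);
}

# Print the aggregated values, one line per counter.
for my $group (sort keys %{ $context->{counters} }) {
  for my $name (sort keys %{ $context->{counters}{$group} }) {
    print "$group\t$name\t$context->{counters}{$group}{$name}\n";
  }
}
```

In a real job the aggregation happens inside Hadoop across all map tasks; the mock keeps a single hash only to make the summing visible.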

Exercise

Run a Hadoop job on /home/straka/wiki/cs-text-medium that uses counters to count the articles by their first letter (ignoring case). You can download the template step-4-exercise.pl, edit it, and execute it.

wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-4-exercise.txt' -O 'step-4-exercise.pl'
# NOW EDIT THE FILE
# $EDITOR step-4-exercise.pl
rm -rf step-4-out-ex; perl step-4-exercise.pl /home/straka/wiki/cs-text-medium/ step-4-out-ex
less step-4-out-ex/part-*

Solution

You can also download the solution step-4-solution.pl and check the correct output.

wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-4-solution.txt' -O 'step-4-solution.pl'
# NOW VIEW THE FILE
# $EDITOR step-4-solution.pl
rm -rf step-4-out-sol; perl step-4-solution.pl /home/straka/wiki/cs-text-medium/ step-4-out-sol
less step-4-out-sol/part-*

Step 3: Basic mapper. Overview Step 5: Basic reducer.
