courses:mapreduce-tutorial:step-26 [2012/01/27 20:11] straka created |
courses:mapreduce-tutorial:step-26 [2012/01/28 15:42] straka |
====== MapReduce Tutorial : ====== | ====== MapReduce Tutorial : Counters and job configuration ====== |
| |
| ===== Counters ===== |
| |
| As in the Perl API, a mapper or a reducer can increment various counters by using ''context.getCounter("Group", "Name").increment(value)'': |
| <code java> |
| public void map(Text key, Text value, Context context) throws IOException, InterruptedException { |
| ... |
| context.getCounter("Group", "Name").increment(value); |
| ... |
| } |
| </code> |
| The ''getCounter'' method returns a [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/Counter.html|Counter]] object. If a counter is incremented frequently, it is therefore enough to call ''getCounter'' once and reuse the returned object:
| <code java> |
| public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException { |
| ... |
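| // The counter needs its own name: a local variable called "values" would clash with the method parameter.
| Counter valuesCounter = context.getCounter("Reducer", "Number of values");
| for (IntWritable value : values) {
| ...
| valuesCounter.increment(1);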
| } |
| } |
| </code> |
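The effect of reusing the returned object can be illustrated without Hadoop. The following is only a toy model: ''ToyCounter'', ''ToyContext'' and the ''lookups'' field are hypothetical stand-ins invented for this sketch, not Hadoop classes, and the sketch says nothing about how Hadoop stores counters internally. It only shows that both variants produce the same count, while the second one performs a single lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a counter (hypothetical, not the Hadoop Counter class).
class ToyCounter {
    private long value = 0;

    void increment(long amount) { value += amount; }

    long getValue() { return value; }
}

// Toy stand-in for the task context: counters are found by group and name.
class ToyContext {
    private final Map<String, ToyCounter> counters = new HashMap<>();
    int lookups = 0; // number of getCounter calls, kept only for illustration

    ToyCounter getCounter(String group, String name) {
        lookups++;
        return counters.computeIfAbsent(group + "\t" + name, k -> new ToyCounter());
    }
}

class CounterReuseDemo {
    // Looks the counter up again on every iteration.
    static void countNaive(ToyContext context, int n) {
        for (int i = 0; i < n; i++) {
            context.getCounter("Reducer", "Number of values").increment(1);
        }
    }

    // Looks the counter up once and reuses the returned object.
    static void countCached(ToyContext context, int n) {
        ToyCounter counter = context.getCounter("Reducer", "Number of values");
        for (int i = 0; i < n; i++) {
            counter.increment(1);
        }
    }

    public static void main(String[] args) {
        ToyContext naive = new ToyContext();
        countNaive(naive, 1000);
        ToyContext cached = new ToyContext();
        countCached(cached, 1000);
        System.out.println(naive.lookups + " lookups vs " + cached.lookups + " lookup");
    }
}
```

In a real job the same pattern applies to ''context.getCounter'': obtain the counter before the loop and call ''increment'' inside it.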
| |
| ===== Job configuration ===== |
| |
| Job properties can be set and read in several ways:
| * on the command line -- the [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/util/ToolRunner.html|ToolRunner]] parses options in the format ''-Dname=value''. See the [[.:step-24#running-the-job|syntax of the hadoop script]].
| * in the driver -- [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/Job.html|Job]]''.getConfiguration()'' returns a [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/conf/Configuration.html|Configuration]] object, which provides the following methods:
|   * ''String get(String name)'' -- get the value of the ''name'' property, or ''null'' if it does not exist.
|   * ''String get(String name, String defaultValue)'' -- get the value of the ''name'' property, or ''defaultValue'' if it does not exist.
|   * ''getBoolean'', ''getClass'', ''getFile'', ''getFloat'', ''getInt'', ''getLong'', ''getStrings'' -- return a typed value of a property (e.g., a number, a file or a class).
|   * ''set(String name, String value)'' -- set the value of the ''name'' property to ''value''.
|   * ''setBoolean'', ''setClass'', ''setFloat'', ''setInt'', ''setLong'', ''setStrings'' -- set a typed value of the ''name'' property (e.g., a number or a class).
| * in a mapper or a reducer -- the ''context'' object also provides the ''getConfiguration()'' method, so the job properties can be read in the mappers and reducers too.
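The lookup semantics listed above (a plain ''get'' returning ''null'', and getters falling back to a default value) can be sketched in plain Java. This is a minimal model under stated assumptions, not Hadoop's ''Configuration'' or ''ToolRunner'': the class ''PropertiesSketch'', its ''fromArgs'' method and the property name ''example.limit'' are all hypothetical, and error handling for malformed numbers is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plain-Java model of job properties; NOT Hadoop's Configuration class.
class PropertiesSketch {
    private final Map<String, String> properties = new HashMap<>();

    // Collects every argument of the form -Dname=value; all other arguments are ignored.
    static PropertiesSketch fromArgs(String[] args) {
        PropertiesSketch conf = new PropertiesSketch();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                conf.set(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return conf;
    }

    void set(String name, String value) { properties.put(name, value); }

    // Typed setter: the value is stored as a string, like every other property.
    void setInt(String name, int value) { set(name, Integer.toString(value)); }

    // Returns null if the property does not exist.
    String get(String name) { return properties.get(name); }

    // Returns defaultValue if the property does not exist.
    String get(String name, String defaultValue) {
        String value = properties.get(name);
        return value == null ? defaultValue : value;
    }

    // Typed getter: parses the stored string, or returns defaultValue if the property is missing.
    int getInt(String name, int defaultValue) {
        String value = properties.get(name);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        PropertiesSketch conf = fromArgs(new String[] {"-Dexample.limit=5", "input", "output"});
        System.out.println(conf.getInt("example.limit", 1));
        System.out.println(conf.get("example.missing", "fallback"));
    }
}
```

In an actual job the same calls would be made on the object returned by ''job.getConfiguration()'' in the driver, or by ''context.getConfiguration()'' in a mapper or a reducer.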
| |
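| Apart from the already mentioned [[.:step-9#a-brief-list-of-hadoop-options|brief list of Hadoop properties]], there is one important Java-specific property:
| * ''mapred.child.java.opts'' -- the options passed to the Java virtual machine of every map and reduce task. The default value is ''-Xmx200m'', which limits the heap of each task to 200 MB.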