Institute of Formal and Applied Linguistics Wiki

courses:mapreduce-tutorial:step-23 [2012/01/27 21:58]
courses:mapreduce-tutorial:step-23 [2012/01/31 14:33] (current)
Line 21: Line 21:
   * ''FloatWritable'' -- 32-bit floating-point number   * ''FloatWritable'' -- 32-bit floating-point number
   * ''DoubleWritable'' -- 64-bit floating-point number   * ''DoubleWritable'' -- 64-bit floating-point number
 +  * ''NullWritable'' -- no value
 For more complicated types like variable-length encoded integers, dictionaries, bloom filters, etc., see [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/Writable.html|Writable]]. For more complicated types like variable-length encoded integers, dictionaries, bloom filters, etc., see [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/Writable.html|Writable]].
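The sizes of the floating-point types can be checked without a Hadoop installation. Hadoop's ''Writable'' types serialize themselves through ''java.io.DataOutput''; the sketch below assumes that ''FloatWritable'' and ''DoubleWritable'' write their payload with ''writeFloat'' and ''writeDouble'' and simply measures what those calls produce. The class ''WritableSizes'' and its methods are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Hadoop: counts the bytes that
// java.io.DataOutput emits for the primitives behind the Writable wrappers.
final class WritableSizes {

    // Bytes written by writeFloat (assumed to be what FloatWritable uses).
    static int floatBytes(float value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeFloat(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    // Bytes written by writeDouble (assumed to be what DoubleWritable uses).
    static int doubleBytes(double value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeDouble(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println("float:  " + floatBytes(1.5f) * 8 + " bits");
        System.out.println("double: " + doubleBytes(1.5) * 8 + " bits");
    }
}
```

A Java ''float'' occupies 4 bytes and a ''double'' 8 bytes on the wire, so choosing between the two wrappers is a trade-off between precision and the volume of data shuffled between mappers and reducers.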
 ==== Types used in Perl API ==== ==== Types used in Perl API ====
-The Perl API can process keys and values of any type -- then using different type than ''Text'', ''toString'' method is called to create a ''String'' representation. +The Perl API always uses strings as keys and values. From the Java point of view: 
- +  * the type of keys and values produced by the Perl API is always ''Text''
-The keys and values produced by Perl API are always of type ''Text''.+  * any type can be used as input to the Perl API -- if the type is different from ''Text'', the ''toString'' method is used to convert the value to a string before it is passed to Perl.
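The conversion rule for Perl API input can be pictured in Java as follows. This is only an illustrative sketch: the method ''toPerlInput'' is hypothetical, and plain Java objects stand in for Hadoop ''Writable'' values, which are assumed to provide a meaningful ''toString''.

```java
// Hypothetical illustration, not Hadoop code: whatever the Java-side type is,
// the Perl side only ever sees its string form.
final class PerlInputSketch {

    // Converts a key or value of any type to the string handed to Perl.
    static String toPerlInput(Object keyOrValue) {
        return keyOrValue.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPerlInput(Integer.valueOf(42))); // a number becomes its decimal text
        System.out.println(toPerlInput("already text"));      // a string is passed through unchanged
    }
}
```

One consequence is that type information is lost on the way into Perl: a Perl mapper that needs a number has to parse it back from the string.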
 ===== Input formats ===== ===== Input formats =====
Line 39: Line 40:
 ===== Output formats ===== ===== Output formats =====
-An input format is a subclass of [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html|FileOutputFormat<K,V>]], where //K// is the type of keys and //V// is the type of values it can store.+An output format is a subclass of [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html|FileOutputFormat<K,V>]], where //K// is the type of keys and //V// is the type of values it can store.
 Available output formats: Available output formats:
   * ''TextOutputFormat'': The type of both keys and values is ''Text''.   * ''TextOutputFormat'': The type of both keys and values is ''Text''.
   * ''SequenceFileOutputFormat'': Any type of keys and values can be used.   * ''SequenceFileOutputFormat'': Any type of keys and values can be used.
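To make the text output concrete, the sketch below mimics how ''TextOutputFormat'' lays out one record, assuming the default tab separator: the key, a tab, the value, and a newline, with an absent key or value (such as ''NullWritable'') written as nothing and without the separator. The class ''TextLineSketch'' and the method ''formatLine'' are hypothetical, and ''null'' stands in for ''NullWritable''.

```java
// Hypothetical illustration, not Hadoop code: builds the line that a
// tab-separated text output would contain for one key/value pair.
final class TextLineSketch {

    // null represents a missing key or value (e.g. NullWritable).
    static String formatLine(String key, String value) {
        if (key == null && value == null) {
            return ""; // nothing is written for a completely empty record
        }
        StringBuilder line = new StringBuilder();
        if (key != null) {
            line.append(key);
        }
        if (key != null && value != null) {
            line.append('\t'); // separator only when both parts are present
        }
        if (value != null) {
            line.append(value);
        }
        return line.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(formatLine("hadoop", "3"));
        System.out.print(formatLine(null, "value only"));
    }
}
```

Emitting a missing key is therefore a simple way to get output files that contain only values, one per line.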
