  * ''FloatWritable'' -- 32-bit floating point number
  * ''DoubleWritable'' -- 64-bit floating point number
  * ''NullWritable'' -- no value
For more complicated types like variable-length encoded integers, dictionaries, bloom filters, etc., see [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/Writable.html|Writable]].
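If none of the predefined types fits, a custom key or value class can implement the ''Writable'' interface linked above. The following is only a minimal sketch -- the class name and its fields are made up for illustration -- assuming the standard ''write''/''readFields'' contract of the interface:

<code java>
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical composite value: a word together with its score.
public class WordScoreWritable implements Writable {
  private String word;
  private double score;

  public WordScoreWritable() {}   // no-argument constructor needed for deserialization

  public void write(DataOutput out) throws IOException {
    out.writeUTF(word);           // serialize the fields in a fixed order
    out.writeDouble(score);
  }

  public void readFields(DataInput in) throws IOException {
    word = in.readUTF();          // deserialize the fields in the same order
    score = in.readDouble();
  }
}
</code>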
==== Types used in Perl API ====
The Perl API always uses strings as keys and values. From the Java point of view:
  * the type of keys and values produced by the Perl API is always ''Text''.
  * any type can be used as input to the Perl API -- if the type is different from ''Text'', its ''toString'' method is called to convert the value to a string before the value is passed to Perl.
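As an illustration of that conversion (a sketch only, not part of the tutorial code), the string a Perl script receives for a non-''Text'' value is simply the result of ''toString'':

<code java>
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;

public class ToStringDemo {
  public static void main(String[] args) {
    // Values of these Writable types are converted to their string form
    // before being handed over to a Perl mapper or reducer.
    System.out.println(new IntWritable(42));        // prints "42"
    System.out.println(new DoubleWritable(3.14));   // prints "3.14"
  }
}
</code>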
===== Input formats =====

===== Output formats =====
An output format is a subclass of [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html|FileOutputFormat<K,V>]], where //K// is the type of keys and //V// is the type of values it can store.

Available output formats:
  * ''TextOutputFormat'': The type of both keys and values is ''Text''.
  * ''SequenceFileOutputFormat'': Any type of keys and values can be used.
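A sketch of selecting an output format in the Java API follows. The job name and output path are made up for illustration, and the mapper, reducer and input settings are omitted -- only the output-related calls are shown:

<code java>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class OutputFormatDemo {
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "output-format-demo");

    // Keys and values written by the job -- any Writable types are allowed,
    // because SequenceFileOutputFormat is chosen below.
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Store the output as a binary sequence file instead of plain text.
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileOutputFormat.setOutputPath(job, new Path(args[0]));
  }
}
</code>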
----

<html>
<table style="width:100%">
<tr>
<td style="text-align:left; width: 33%; "></html>[[step-22|Step 22]]: Optional – Setting Eclipse.<html></td>
<td style="text-align:center; width: 33%; "></html>[[.|Overview]]<html></td>
<td style="text-align:right; width: 33%; "></html>[[step-24|Step 24]]: Mappers, running Java Hadoop jobs, combiners.<html></td>
</tr>
</table>
</html>