Differences
This shows you the differences between two versions of the page.
courses:mapreduce-tutorial:step-2 [2012/01/24 08:56] straka |
courses:mapreduce-tutorial:step-2 [2012/01/29 16:03] (current) straka |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== MapReduce tutorial : Input and output format, testing data. ====== | ====== MapReduce tutorial : Input and output format, testing data. ====== | ||
+ | |||
+ | The MapReduce framework works with (key, value) pairs throughout. These pairs can be read from and written to files, and several file formats are available. | ||
+ | |||
+ | ===== Input formats ===== | ||
+ | * '' | ||
+ | * '' | ||
+ | * '' | ||
+ | Input files can be compressed; the MR framework decompresses them transparently. | ||
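To illustrate what transparent decompression of (key, value) input means, here is a minimal standalone sketch in Python. It is not the MR framework's own reader: the function name `read_pairs`, the one-pair-per-line layout and the tab separator between key and value are all assumptions made for this example.

```python
import gzip
from typing import Iterator, Tuple

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_pairs(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (key, value) pairs from a plain or gzip-compressed text file.

    Assumed layout (hypothetical, not taken from the tutorial):
    one pair per line, key and value separated by the first tab.
    """
    # Peek at the first two bytes to decide whether the file is gzip-compressed.
    with open(path, "rb") as probe:
        compressed = probe.read(2) == GZIP_MAGIC

    opener = gzip.open if compressed else open
    with opener(path, "rt", encoding="utf-8") as lines:
        for line in lines:
            # Split on the first tab only; a line without a tab gets an empty value.
            key, _, value = line.rstrip("\n").partition("\t")
            yield key, value
```

The caller uses `read_pairs` the same way for compressed and uncompressed files, which is the behaviour the sentence above describes for the framework.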
+ | |||
+ | ===== Output formats ===== | ||
+ | * '' | ||
+ | * '' | ||
+ | Output files can be compressed on demand. | ||
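Compression on demand can be pictured as a switch on the writer. The following Python sketch is only an illustration under stated assumptions: `write_pairs` and its `compress` parameter are hypothetical names, and the tab-separated one-pair-per-line text layout is assumed rather than taken from the framework.

```python
import gzip
from typing import Iterable, Tuple


def write_pairs(path: str, pairs: Iterable[Tuple[str, str]],
                compress: bool = False) -> None:
    """Write (key, value) pairs as tab-separated lines (assumed layout).

    When compress is True, the same text is written through gzip.
    """
    opener = gzip.open if compress else open
    with opener(path, "wt", encoding="utf-8") as out:
        for key, value in pairs:
            out.write(f"{key}\t{value}\n")
```

A call such as `write_pairs("out.txt.gz", pairs, compress=True)` produces a gzip file holding the same lines that the uncompressed call would write.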
+ | |||
+ | ===== Input data ===== | ||
+ | Testing data are available in several formats and sizes: | ||
+ | * ''/ | ||
+ | * ''/ | ||
+ | * ''/ | ||
+ | * ''/ | ||
+ | * ''/ | ||
+ | * ''/ | ||
+ | * ''/ | ||
+ | It is recommended to use the text format in the tutorial, so that both input and output files are readable. | ||
+ | |||
+ | ---- | ||
+ | |||