Differences
This shows you the differences between two versions of the page.
| courses:mapreduce-tutorial:step-2 [2012/01/24 08:56] straka | courses:mapreduce-tutorial:step-2 [2012/01/29 16:03] (current) straka | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== MapReduce tutorial : Input and output format, testing data. ====== | ====== MapReduce tutorial : Input and output format, testing data. ====== | ||
| + | |||
| + | The MapReduce framework works with (key, value) pairs throughout. These pairs can be read from a file and written to a file, and several formats are available. | ||
| + | |||
| + | ===== Input formats ===== | ||
| + | * '' | ||
| + | * '' | ||
| + | * '' | ||
| + | Input files in any of these formats can be compressed; the MapReduce framework decompresses them transparently. | ||
| + | |||
| + | ===== Output formats ===== | ||
| + | * '' | ||
| + | * '' | ||
| + | Output files in either format can be compressed on demand. | ||
| + | |||
| + | ===== Input data ===== | ||
| + | Testing data are available in several formats and sizes: | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | * ''/ | ||
| + | The text format is recommended for this tutorial, so that both input and output files are human-readable. | ||
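
To make the idea of text-format (key, value) files more concrete, here is a minimal sketch in plain Python. It is not the tutorial framework's API: the function names `read_pairs` and `write_pairs` are hypothetical, and it assumes one pair per line with a tab between key and value, and gzip compression signalled by a `.gz` suffix. The format names and data paths listed above are truncated in this diff, so none of them are used here.

```python
import gzip


def open_text(path, mode):
    """Open a text file, using gzip when the name ends in .gz (assumed convention)."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t", encoding="utf-8")
    return open(path, mode, encoding="utf-8")


def read_pairs(path, sep="\t"):
    """Yield (key, value) tuples, one per line; the caller does not care
    whether the file is compressed."""
    with open_text(path, "r") as f:
        for line in f:
            # Split on the first separator only, so values may contain tabs.
            key, _, value = line.rstrip("\n").partition(sep)
            yield key, value


def write_pairs(path, pairs, sep="\t", compress=False):
    """Write (key, value) tuples as text; compress only on demand.
    Returns the path actually written."""
    if compress and not path.endswith(".gz"):
        path += ".gz"
    with open_text(path, "w") as f:
        for key, value in pairs:
            f.write(f"{key}{sep}{value}\n")
    return path
```

Calling `write_pairs("out.txt", pairs, compress=True)` would produce `out.txt.gz`, and `read_pairs` on either the plain or the compressed file yields the same pairs, which mirrors the transparent decompression described above.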
| + | |||
