courses:mapreduce-tutorial:step-9 [2012/01/25 16:16] straka |
courses:mapreduce-tutorial:step-9 [2012/01/25 16:18] straka |
Every Hadoop option has a read-only default. These defaults are overridden by cluster-specific options. Finally, all of these are overridden by job-specific options given on the command line (or set using the Java API). | Every Hadoop option has a read-only default. These defaults are overridden by cluster-specific options. Finally, all of these are overridden by job-specific options given on the command line (or set using the Java API). |
| |
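The precedence described above (defaults, then cluster-specific options, then job-specific options) can be illustrated with a minimal Python sketch. It does not use Hadoop itself: the three dictionaries and the ''resolve'' helper are hypothetical stand-ins, and the values are chosen for illustration only.

```python
from collections import ChainMap

# Hypothetical option layers; the values are illustrative, not real cluster settings.
defaults = {"mapred.reduce.tasks": "1", "mapred.compress.map.output": "false"}
cluster = {"mapred.job.tracker": "cluster_master"}
job = {"mapred.reduce.tasks": "4"}


def resolve(job_opts, cluster_opts, default_opts):
    """Merge the layers; a key found in an earlier layer shadows later ones."""
    # ChainMap looks keys up left to right, so job options win over
    # cluster options, which in turn win over the defaults.
    return dict(ChainMap(job_opts, cluster_opts, default_opts))


effective = resolve(job, cluster, defaults)
```

Here the job-specific value of ''mapred.reduce.tasks'' shadows the default, while options set in only one layer pass through unchanged.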
==== Mapping of Perl options to Hadoop ==== | ===== A brief list of Hadoop options ===== |
^ Perl options ^ Hadoop options ^ | |
| no options \\ (running locally) | ''-Dmapred.job.tracker=local'' \\ ''-Dmapred.local.dir=hadoop-localrunner-tmp'' \\ ''-Dhadoop.tmp.dir=hadoop-localrunner-tmp'' | | |
| ''-jt cluster_master'' | ''-Dmapred.job.tracker=cluster_master'' | | |
| ''-c cluster_machines'' | configuration of new cluster contains \\ ''-Dmapred.job.tracker=cluster_master'' | | |
| ''-r number_of_reducers'' | ''-Dmapred.reduce.tasks=number_of_reducers'' | | |
| |
===== Brief list of Hadoop options ===== | |
^ Hadoop option ^ Default value ^ Description ^ | ^ Hadoop option ^ Default value ^ Description ^ |
| ''mapred.job.tracker'' | ? | Cluster master | | | ''mapred.job.tracker'' | ? | Cluster master | |
| ''mapred.reduce.tasks'' | 1 | Number of reducers | | | ''mapred.reduce.tasks'' | 1 | Number of reducers | |
| ''mapred.min.split.size'' | 1 | Minimum size of file split in bytes | | | ''mapred.min.split.size'' | 1 | Minimum size of file split in bytes | |
| ''mapred.max.split.size'' | 2^63-1 | Maximum size of file split in bytes | | | ''mapred.max.split.size'' | 2%%^%%63-1 | Maximum size of file split in bytes | |
| ''mapred.map.tasks.speculative.execution'' | true | If true, then multiple instances of some map tasks may be executed in parallel | | | ''mapred.map.tasks.speculative.execution'' | true | If true, then multiple instances of some map tasks may be executed in parallel | |
| ''mapred.reduce.tasks.speculative.execution'' | true | If true, then multiple instances of some reduce tasks may be executed in parallel | | | ''mapred.reduce.tasks.speculative.execution'' | true | If true, then multiple instances of some reduce tasks may be executed in parallel | |
| ''mapred.compress.map.output'' | false | Should the outputs of the maps be compressed before being sent across the network. Uses SequenceFile compression | | | ''mapred.compress.map.output'' | false | Should the outputs of the maps be compressed before being sent across the network. Uses SequenceFile compression | |
| |
A more complete list (but not exhaustive) can be found {{http://hadoop.apache.org/common/docs/r1.0.0/mapred-default.html|here}. | A more complete list (but not exhaustive) can be found [[http://hadoop.apache.org/common/docs/r1.0.0/mapred-default.html|here]]. |
| |
| ===== Mapping of Perl options to Hadoop ===== |
| ^ Perl options ^ Hadoop options ^ |
| | no options \\ (running locally) | ''-Dmapred.job.tracker=local'' \\ ''-Dmapred.local.dir=hadoop-localrunner-tmp'' \\ ''-Dhadoop.tmp.dir=hadoop-localrunner-tmp'' | |
| | ''-jt cluster_master'' | ''-Dmapred.job.tracker=cluster_master'' | |
| | ''-c cluster_machines'' | configuration of new cluster contains \\ ''-Dmapred.job.tracker=cluster_master'' | |
| | ''-r number_of_reducers'' | ''-Dmapred.reduce.tasks=number_of_reducers'' | |
| |
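The mapping of Perl options to Hadoop options can be sketched as a small translation function. This is an illustration only, not the actual Perl runner: ''perl_to_hadoop'' is a hypothetical helper, and it assumes that the local-run options apply whenever ''-jt'' is absent. The ''-c'' option is left out because the cluster master is known only once the new cluster has been started.

```python
# Options used for a local run, taken from the mapping table.
LOCAL_OPTIONS = [
    "-Dmapred.job.tracker=local",
    "-Dmapred.local.dir=hadoop-localrunner-tmp",
    "-Dhadoop.tmp.dir=hadoop-localrunner-tmp",
]


def perl_to_hadoop(perl_args):
    """Translate a list of Perl runner options into Hadoop -D options."""
    job_tracker = None
    reducers = None
    args = iter(perl_args)
    for arg in args:
        if arg not in ("-jt", "-r"):
            raise ValueError(f"unsupported option: {arg}")
        value = next(args, None)
        if value is None:
            raise ValueError(f"option {arg} requires a value")
        if arg == "-jt":
            job_tracker = value
        else:
            reducers = value

    # Assumption: without -jt the job runs locally.
    if job_tracker is None:
        hadoop_opts = list(LOCAL_OPTIONS)
    else:
        hadoop_opts = [f"-Dmapred.job.tracker={job_tracker}"]
    if reducers is not None:
        hadoop_opts.append(f"-Dmapred.reduce.tasks={reducers}")
    return hadoop_opts
```

For example, ''perl_to_hadoop(["-jt", "cluster_master", "-r", "5"])'' yields the two ''-D'' options from the corresponding table rows, and an empty argument list yields the three local-run options.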