MapReduce Tutorial - Perl API
The main class is Hadoop::Runner:
package Hadoop::Runner;
has 'mapper' => (does => 'Hadoop::Mapper', required => 1);
has 'reducer' => (does => 'Hadoop::Reducer');
has 'combiner' => (does => 'Hadoop::Reducer');
has 'partitioner' => (does => 'Hadoop::Partitioner');
has 'input_format' => (isa => 'InputFormat', default => 'TextInputFormat');
has 'output_format' => (isa => 'OutputFormat', default => 'TextOutputFormat');
has 'output_compression' => (isa => 'Bool', default => 0);
has 'hadoop_prefix' => (isa => 'Str', default => '/SGE/HADOOP/active');
has 'keep_env' => (isa => 'ArrayRef[Str]', default => sub { ["PERLLIB", "PERL5LIB"] });
sub run();
mapper – a Hadoop::Mapper to use (required)
reducer – an optional Hadoop::Reducer to use
combiner – an optional Hadoop::Reducer to use as combiner
partitioner – an optional Hadoop::Partitioner to use
input_format – one of TextInputFormat, KeyValueTextInputFormat, SequenceFileInputFormat
output_format – one of TextOutputFormat, SequenceFileOutputFormat
output_compression – Bool flag controlling the compression of output
hadoop_prefix – the prefix of the Hadoop installation. The default value is fine on the UFAL cluster.
keep_env – which environment variables are preserved when running Perl mappers, reducers, combiners and partitioners
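
The defaults and allowed values listed above can be summarised in a small stand-alone sketch. It uses only core Perl and does not load or reproduce Hadoop::Runner itself: the helper runner_options and the placeholder class name My::Mapper are hypothetical and exist only for illustration. In the real API, mapper would be an object doing the Hadoop::Mapper role rather than a string.

```perl
use strict;
use warnings;

# Defaults copied from the attribute declarations above.
my %DEFAULTS = (
    input_format       => 'TextInputFormat',
    output_format      => 'TextOutputFormat',
    output_compression => 0,
    hadoop_prefix      => '/SGE/HADOOP/active',
    keep_env           => ['PERLLIB', 'PERL5LIB'],
);

# Allowed format names, as listed in the attribute descriptions.
my %ALLOWED = (
    input_format  => { map { ($_ => 1) }
        qw(TextInputFormat KeyValueTextInputFormat SequenceFileInputFormat) },
    output_format => { map { ($_ => 1) }
        qw(TextOutputFormat SequenceFileOutputFormat) },
);

# Hypothetical helper (NOT part of Hadoop::Runner): merges the caller's
# options over the defaults and rejects unknown format names.
sub runner_options {
    my (%args) = @_;
    die "mapper is required\n" unless defined $args{mapper};

    my %opts = (%DEFAULTS, %args);

    # Copy keep_env so that callers never share the default array.
    $opts{keep_env} = [ @{ $opts{keep_env} } ];

    for my $name (sort keys %ALLOWED) {
        die "invalid $name: $opts{$name}\n"
            unless $ALLOWED{$name}{ $opts{$name} };
    }
    return \%opts;
}

# Example: override only the output format, keep every other default.
# 'My::Mapper' is a placeholder name, not a class shipped with the API.
my $opts = runner_options(
    mapper        => 'My::Mapper',
    output_format => 'SequenceFileOutputFormat',
);

# Print the effective settings, one per line.
for my $name (sort keys %$opts) {
    my $value = $opts->{$name};
    $value = join(',', @$value) if ref $value eq 'ARRAY';
    print "$name = $value\n";
}
```

Only mapper has to be supplied. Every other attribute falls back to its default, so a typical job overrides one or two settings at most.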
