Differences

This shows you the differences between two versions of the page.

--- courses:mapreduce-tutorial:step-8 [2012/01/28 17:40] straka
+++ courses:mapreduce-tutorial:step-8 [2012/01/29 21:04] straka
@@ Line 1: / Line 1: @@
 ====== MapReduce Tutorial : Multiple mappers, reducers and partitioning ======
-In order to achieve parallelism, mappers and reducers must be executed in parallel.
+A Hadoop job, which is expected to run on many computers at the same time, needs to use multiple mappers and reducers. The number of mappers and reducers can be controlled to some degree.
 ===== Multiple mappers =====
@@ Line 13: / Line 13: @@
 To use multiple reducers, the MR job must be executed by a cluster (even one consisting of a single computer), not locally. The number of reducers is specified by the ''-r'' flag:
   perl script.pl run [-jt cluster_master | -c cluster_size [-w sec_to_wait]] [-r number_of_reducers]
+The optimal number of reducers is the same as the number of machines in the cluster, so that all the reducers can run at the same time.
 ==== Partitioning ====
