[[courses:mapreduce-tutorial|MapReduce Tutorial - official course]]

====== Hadoop & MapReduce ======

  * [[http://en.wikipedia.org/wiki/Hadoop|Hadoop - Wikipedia]]
  * [[http://en.wikipedia.org/wiki/MapReduce|MapReduce - Wikipedia]]
  * [[http://labs.google.com/papers/mapreduce.html|MapReduce: Simplified Data Processing on Large Clusters (original paper)]] + [[http://scholar.google.cz/scholar?cites=10940266603640308767&as_sdt=2005&sciodt=0,5|3650 citations]]

===== Books =====

  * [[http://www.umiacs.umd.edu/~jimmylin/book.html|Data-Intensive Text Processing with MapReduce]] - links to courses that use this book
  * [[http://i.stanford.edu/~ullman/mmds.html|Mining of Massive Datasets]]

===== Tutorials =====

  * [[http://developer.yahoo.com/hadoop/tutorial/|Yahoo! Hadoop Tutorial]]
  * [[http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html|Google: Cluster Computing and MapReduce]]
  * [[http://code.google.com/edu/submissions/mapreduce/listing.html|Google: MapReduce in a Week]]
  * [[http://sites.google.com/site/mriap2008/home|MapReduce course]]
  * [[http://www.slideshare.net/amundtveit/mapreduce-in-search|Mapreduce in Search]]
  * [[http://michaelnielsen.org/blog/implementing-statistical-machine-translation-using-mapreduce/|Implementing Statistical Machine Translation Using MapReduce]]

===== Scientific Papers =====

  * [[http://atbrox.com/2009/10/01/mapreduce-and-hadoop-academic-papers/|Mapreduce & Hadoop Algorithms in Academic Papers 1 (2009-10-01)]]
  * [[http://atbrox.com/2010/02/12/mapreduce-hadoop-algorithms-in-academic-papers-updated/|Mapreduce & Hadoop Algorithms in Academic Papers 2 (2010-02-12)]]
  * [[http://atbrox.com/2010/05/08/mapreduce-hadoop-algorithms-in-academic-papers-may-2010-update/|Mapreduce & Hadoop Algorithms in Academic Papers 3 (2010-05-08)]]
  * [[http://atbrox.com/2011/05/16/mapreduce-hadoop-algorithms-in-academic-papers-4th-update-may-2011/|Mapreduce & Hadoop Algorithms in Academic Papers 4 (2011-05-16)]]
  * [[http://www.mendeley.com/groups/1058401/mapreduce-applications/papers/|Mendeley - MapReduce]]
  * [[http://www.columbia.edu/~ak2834/mapreduce.html|List of papers]]

===== Courses =====

  * [[http://lintool.github.com/Cloud9/]]
  * [[http://dicta-f11.utcompling.com/schedule|Data-Intensive Computing for Text Analysis]] - includes slides and homework
  * [[http://courses.cs.tamu.edu/caverlee/csce689/|Internet-Scale Data Management]] - each class covers one general topic
  * [[http://www.eurecom.fr/~michiard/CCSS.html|Summer School on Cloud Computing: Challenges and opportunities]] - 220 slides
  * [[http://www.stanford.edu/class/cs341/cs341-10-proj/index.html|Project in Mining Massive Data Sets]]

===== Related projects =====

  * [[http://mahout.apache.org/|Scalable machine learning and data mining]]