Differences

This shows you the differences between two versions of the page.

--- courses:mapreduce:introduction [2012/01/13 11:13]
majlis removed
+++ — (current)
@@ Line 1: / Line 1: @@
-====== Large data processing using MapReduce ======
-For an introduction, it is best to read the [[http://fox.auryn.cz/mr/original_paper_dean04.pdf|original paper]].
-There are also Czech [[http://fox.auryn.cz/mr/slides_czech_2009.pdf|slides]] (up to slide 45).
-There are nice slides from the three-day course available at [[http://sites.google.com/site/mriap2008/lectures]].
-I would suggest starting with http://sites.google.com/site/mriap2008/intro_to_mapreduce.pdf .
-Now is a good time to solve the following exercises:
-  * create a list of unique words present in a given text
-  * count all bigrams present in a given text
-  * count all n-grams for all n <= N in a given text
-  * compute the probability that a word is capitalized
-  * given a large corpus, find all undiacritized forms of words present in the corpus and for every such form, compute the most probable diacritization
-  * create an index: given many URLs and their texts, create for each word a list of URLs whose text contains this word. For each such URL, produce an ascending list of positions of this word in the document.
-  * implement the iterative k-means algorithm
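As a starting point for the first two exercises above, the following is a minimal single-process sketch of the map, shuffle, and reduce phases in plain Python. It does not use Hadoop or any other framework; the names `run_job`, `shuffle`, `map_words`, `map_bigrams`, and `reduce_count` are hypothetical helpers introduced only for illustration, and bigrams are assumed not to cross line boundaries.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Mapper: emit (word, 1) for every whitespace-separated word in one line."""
    return [(word, 1) for word in line.lower().split()]


def map_bigrams(line):
    """Mapper: emit ((w1, w2), 1) for every adjacent word pair in one line."""
    words = line.lower().split()
    return [((a, b), 1) for a, b in zip(words, words[1:])]


def shuffle(pairs):
    """Group values by key, imitating what a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_count(key, values):
    """Reducer: sum all counts emitted for one key."""
    return key, sum(values)


def run_job(lines, mapper, reducer):
    """Run one map/shuffle/reduce pass over an in-memory list of lines."""
    mapped = chain.from_iterable(mapper(line) for line in lines)
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())


text = ["the cat sat", "the cat ran"]
unique_words = sorted(run_job(text, map_words, reduce_count))
bigram_counts = run_job(text, map_bigrams, reduce_count)
```

The unique-word list is simply the set of keys that reach the reducer, so the same job shape covers both exercises; only the mapper changes.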
-The following slides discuss solutions to various problems using MR:
-  * http://sites.google.com/site/mriap2008/what_is_mapreduce.pdf
-  * http://sites.google.com/site/mriap2008/word_context_enthropy.pdf
-  * http://sites.google.com/site/mriap2008/hadoop_and_k_means.pdf pages 23-30
-  * http://sites.google.com/site/mriap2008/not_everything_is_nail.pdf (problems difficult for MR)
-There is also a paper about implementing various machine learning algorithms (SVM, EM, Bayes, etc.) using MapReduce on multicore machines, which also applies to distributed computations: [[http://fox.auryn.cz/mr/machine_learning_using_mr_nips06.pdf]].
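The iterative k-means exercise listed above can be sketched in the same spirit: each iteration is one map step (assign every point to its nearest centroid) followed by one reduce step (average the points assigned to each centroid). This is a single-process illustration under stated assumptions, not the approach taken in the linked slides or paper; the names `nearest`, `kmeans_step`, and `kmeans`, and the choice to keep an old centroid when its cluster is empty, are assumptions made for this example.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans_step(points, centroids):
    """One k-means iteration expressed as a map phase and a reduce phase."""
    # Map: emit (index of nearest centroid, point), grouped by key.
    groups = defaultdict(list)
    for point in points:
        groups[nearest(point, centroids)].append(point)
    # Reduce: the new centroid is the mean of its assigned points.
    # Assumption: an empty cluster keeps its previous centroid.
    updated = []
    for i, old in enumerate(centroids):
        members = groups.get(i)
        if members:
            updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
        else:
            updated.append(old)
    return updated


def kmeans(points, centroids, max_iter=20):
    """Repeat map/reduce passes until the centroids stop changing or max_iter is hit."""
    for _ in range(max_iter):
        updated = kmeans_step(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
```

In a distributed setting each iteration would be a separate job, with the current centroids passed to every mapper; here they are just a function argument.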


Institute of Formal and Applied Linguistics Wiki
