Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-29 [2012/02/05 19:13]
straka
courses:mapreduce-tutorial:step-29 [2012/02/05 19:14] (current)
straka
Line 87: Line 87:
 Improve the [[.:step-28#exercise-1|inverted index exercise]] from the previous step to create for each word a //sorted// list of ''DocWithOccurrences<Text>''.
  
-Use the same approach as with the ''IntPair'' -- create a type ''TextPair'', which stores two values of type ''Text'' and let the mapper create ''(TextPair, DocWIthOccurrences<Text>'' pairs, where the ''TextPair'' contains the word and then the document. Provide a ''FirstOnlyComparator'' which compares two ''TextPair''s using only the word (hint: use [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/Text.Comparator.html#compare(byte[],%20int,%20int,%20byte[],%20int,%20int)|Text.Comparator.compare]] when defining the byte version ''FirstOnlyComparator.compare'').
+Use the same approach as with the ''IntPair'' -- create a type ''TextPair'', which stores two values of type ''Text'' and let the mapper create ''(TextPair, DocWithOccurrences<Text>'' pairs, where the ''TextPair'' contains the word and then the document. Provide a ''FirstOnlyComparator'' which compares two ''TextPair''s using only the word (hint: use [[http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/Text.Comparator.html#compare(byte[],%20int,%20int,%20byte[],%20int,%20int)|Text.Comparator.compare]] when defining the byte version ''FirstOnlyComparator.compare'') and use it as a grouping comparator.
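
The idea behind a byte-level "first only" comparator can be sketched without Hadoop on the classpath. The sketch below is only an illustration under stated assumptions, not the course's solution: the class ''FirstOnlyComparatorSketch'', its helpers ''serialize'', ''readLength'' and ''compareFirst'', and the fixed 4-byte length prefix are all hypothetical. Hadoop's ''Text'' uses a variable-length integer prefix instead, which is why the exercise suggests delegating to ''Text.Comparator.compare'' there.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hadoop-free stand-in: every name and the byte layout are illustrative assumptions.
final class FirstOnlyComparatorSketch {

    // Serializes (word, document) as two fields, each written as a
    // 4-byte big-endian length followed by the UTF-8 bytes.
    static byte[] serialize(String word, String document) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String field : new String[] {word, document}) {
            byte[] utf8 = field.getBytes(StandardCharsets.UTF_8);
            out.write(utf8.length >>> 24);
            out.write(utf8.length >>> 16);
            out.write(utf8.length >>> 8);
            out.write(utf8.length);
            out.write(utf8, 0, utf8.length);
        }
        return out.toByteArray();
    }

    // Reads the 4-byte big-endian length stored at offset s.
    static int readLength(byte[] b, int s) {
        return ((b[s] & 0xff) << 24) | ((b[s + 1] & 0xff) << 16)
             | ((b[s + 2] & 0xff) << 8) | (b[s + 3] & 0xff);
    }

    // Compares two serialized pairs starting at offsets s1 and s2 using only
    // the first field (the word); the bytes of the document are never read.
    static int compareFirst(byte[] b1, int s1, byte[] b2, int s2) {
        int len1 = readLength(b1, s1);
        int len2 = readLength(b2, s2);
        int common = Math.min(len1, len2);
        for (int i = 0; i < common; i++) {
            int x = b1[s1 + 4 + i] & 0xff; // compare as unsigned bytes
            int y = b2[s2 + 4 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return len1 - len2; // a proper prefix sorts first
    }

    public static void main(String[] args) {
        byte[] a = serialize("apple", "doc1");
        byte[] b = serialize("apple", "doc2");
        byte[] c = serialize("banana", "doc1");
        System.out.println(compareFirst(a, 0, b, 0) == 0); // same word, different document
        System.out.println(compareFirst(a, 0, c, 0) < 0);  // different words
    }
}
```

Because pairs with the same word compare as equal, a comparator of this kind places all documents of one word into a single reduce group, while the full ''TextPair'' ordering still decides the order of the values inside the group. In Hadoop the real comparator would be registered on the job as the grouping comparator, for example with ''Job.setGroupingComparatorClass''.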
  
 ----
