
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

courses:rg:2012:longdtreport [2012/03/12 20:22]
longdt
courses:rg:2012:longdtreport [2012/03/12 22:39]
longdt
Line 9: Line 9:
 How it will run faster and use a smaller amount of memory.
  
-==== Notes ==== +==== Encoding ==== 
 +I. Encoding the count 
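 +In the Web1T corpus, the most frequent n-gram occurs 95 billion times, but the corpus contains only 770 000 unique count values.
 +=> Maintaining a value rank array is a good way to encode the counts: each n-gram stores the rank of its count instead of the count itself.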
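The value rank idea above can be sketched as follows. This is only a minimal illustration, not the implementation from the paper: the names `build_value_rank` and `bits_needed` and the toy counts are hypothetical, and it assumes ranks are assigned by how often each count value occurs.

```python
from collections import Counter


def build_value_rank(counts):
    """Build a value rank array from an iterable of n-gram counts.

    Returns (rank_to_value, value_to_rank). Assumption: the count values
    that occur most often receive the smallest ranks; ties are broken by
    the smaller count value.
    """
    freq = Counter(counts)
    ordered = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    rank_to_value = [value for value, _ in ordered]
    value_to_rank = {value: rank for rank, value in enumerate(rank_to_value)}
    return rank_to_value, value_to_rank


def bits_needed(n_distinct):
    """Number of bits needed to store a rank in the range [0, n_distinct)."""
    return max(1, (n_distinct - 1).bit_length())


# Toy data (hypothetical): n-gram -> raw count.
ngram_counts = {
    "the cat": 5,
    "a dog": 5,
    "of the": 95_000_000_000,
    "a cat": 2,
    "cat sat": 5,
}

rank_to_value, value_to_rank = build_value_rank(ngram_counts.values())

# Each n-gram stores only the small rank; the raw count is recovered
# by indexing into the shared rank_to_value array.
encoded = {ngram: value_to_rank[c] for ngram, c in ngram_counts.items()}
decoded = {ngram: rank_to_value[r] for ngram, r in encoded.items()}
```

With 770 000 distinct count values, a rank fits in 20 bits, whereas storing a raw count as large as 95 billion directly would take 37 bits.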
 +II. Encoding the n-gram 
 +**Idea**
 Most of the attendees apparently understood the talk and the paper well, and a
 lively discussion followed. One of our first topics of debate was the notion of
