====== Faster and Smaller N-Gram Language Model ======
//Presenter: Joachim Daiber\\
Subscriber: Long DT//\\
Date: 12-March-2012\\

==== Overview ====

On Monday, October 24th 2011, we heard a talk about a paper by Valentin
Spitkovsky, Hiyan Alshawi and Daniel Jurafsky on enhancing unsupervised language
parsers. The paper focuses on improving the state of the art in unsupervised
parsing, and reports improvements on the order of percentage points, which
certainly makes it a paper worth noticing.

==== Notes ====

Most of the attendees apparently understood the talk and the paper well, and a
lively discussion followed. One of our first topics of debate was the notion of
a skyline presented in the paper. The skyline was a somewhat supervised element:
the authors estimated the initial parameters of a model from gold data and
trained it afterwards. They assumed that a model with parameters estimated from
gold data cannot be beaten by an unsupervisedly trained model. However, after
training the skyline model, its accuracy dropped very significantly. The reasons
for this were a point of surprise for us as well as for the paper's authors.

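As we understood it, the skyline amounts to seeding the model with
maximum-likelihood estimates from gold trees before unsupervised training takes
over. A minimal sketch, assuming trees are given as lists of (dependent tag,
head tag) pairs; the representation and all names here are our own illustration,
not the paper's:

<code python>
from collections import Counter

def skyline_init(gold_trees):
    """Maximum-likelihood estimate of P(dependent tag | head tag)
    from gold trees, used only to initialize the model."""
    pair_counts = Counter()
    head_counts = Counter()
    for tree in gold_trees:              # tree: [(dep_tag, head_tag), ...]
        for dep_tag, head_tag in tree:
            pair_counts[(head_tag, dep_tag)] += 1
            head_counts[head_tag] += 1
    return {(h, d): count / head_counts[h]
            for (h, d), count in pair_counts.items()}
</code>
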
Complementary to the skyline, the authors presented a baseline which their
final model should definitely beat. They called this baseline "uninformed",
but were vague about the exact probability distribution used in it. We could
only speculate that it was a uniform or random probability distribution.

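If our speculation is right, such an "uninformed" baseline could be as simple
as the following sketch, which attaches every token to a head chosen uniformly
at random; the function name and setup are our assumption, not the paper's:

<code python>
import random

def uninformed_parse(n_tokens):
    """Choose a head uniformly at random for each token.
    Heads are 1-based token positions; 0 is the artificial root."""
    heads = []
    for position in range(1, n_tokens + 1):
        candidates = [h for h in range(n_tokens + 1) if h != position]
        heads.append(random.choice(candidates))
    return heads
</code>

Note that this does not even guarantee a well-formed tree; a uniform
distribution over valid dependency trees would be another plausible reading of
"uninformed".
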
A point about unsupervised language modeling came up: many linguistic phenomena
are annotated in a way that is to some extent arbitrary and reflects the
linguistic theory used more than the language itself, so an unsupervised model
cannot hope to get them right. The example we discussed was whether the word
"should" governs the verb it is bound with, or vice versa. The authors noticed
that dependency orientation in general was not a particularly strong point of
their parser, and so they also included an evaluation metric that ignores
dependency orientations.

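This orientation-ignoring metric can be pictured as an undirected variant of
attachment accuracy: an edge also counts as correct when it is predicted with
flipped direction. A sketch under the usual head-array encoding (our
assumption; the paper's exact definition may differ):

<code python>
def attachment_accuracy(gold_heads, pred_heads):
    """Directed and undirected attachment accuracy.
    Heads are 1-based token positions, 0 = artificial root."""
    n = len(gold_heads)
    directed = sum(g == p for g, p in zip(gold_heads, pred_heads)) / n
    # Undirected: a predicted edge {d, h} also counts if the gold tree
    # contains the same pair with the opposite orientation.
    gold_edges = {frozenset((d, h)) for d, h in enumerate(gold_heads, start=1)}
    undirected = sum(frozenset((d, h)) in gold_edges
                     for d, h in enumerate(pred_heads, start=1)) / n
    return directed, undirected
</code>
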
Perhaps the most crucial observation the authors made was that there is a limit
beyond which feeding more data to the model training hurts its accuracy. They
progressed from short sentences to longer ones, and identified a sentence
length of 15 as the threshold at which it is best to start ignoring any further
training data. However, we were not entirely clear on how they computed this
constant. If the model is to be fully unsupervised, it remains an open question
how to set this threshold, because it cannot safely be assumed to be the same
for all languages and setups.

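The short-to-long progression with a hard cut-off can be sketched as follows;
the cap of 15 is the value reported in the paper, while the names and the
staging scheme are merely our illustration:

<code python>
MAX_TRAIN_LEN = 15  # threshold reported in the paper

def training_stages(corpus, max_len=MAX_TRAIN_LEN):
    """Yield training sets of growing maximum sentence length,
    mimicking the short-to-long progression; sentences longer
    than max_len are ignored entirely."""
    for cap in range(1, max_len + 1):
        stage = [sentence for sentence in corpus if len(sentence) <= cap]
        if stage:
            yield cap, stage
</code>
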
The writing style of the paper was also a matter of differing opinions.
Undeniably, it is written in a vocabulary-intensive fashion, bringing readers
face to face with words like "unbridled" or "jettison", which I personally had
never seen before.

==== Conclusion ====

All in all, it was a paper worth reading, well presented, and thoroughly
discussed, bringing useful general ideas as well as interesting details.
