====== From Baby Steps to Leapfrog: How "Less is More" in Unsupervised Dependency Parsing ======
  * [[http://www.aclweb.org/anthology/N/N10/N10-1116.pdf|PDF]]
  
===== 1 Introduction =====
  * [[http://dl.acm.org/citation.cfm?id=1218955.1219016|Klein and Manning’s (2004) Dependency Model with Valence (DMV)]] was the first to beat a simple parsing heuristic, the right-branching baseline.
  * Headden et al.’s (2009) Extended Valence Grammar (EVG) combats data sparsity with smoothing alone, training on the same small subset of the treebank as the classic implementation of the DMV.
  * Cohen and Smith (2009) use more complicated algorithms (variational EM and MBR decoding) and stronger linguistic hints (tying related parts of speech and syntactically similar bilingual data).

===== 2 Intuition =====
==== Baby Steps ====
  * The base case, sentences of length one, has a trivial solution that requires neither initialization nor search, yet it already reveals something of sentence heads.
  * The next step, sentences of length one and two, refines initial impressions of heads, introduces dependents, and exposes their identities and relative positions.
  * Step k+1 augments the training input to include lengths 1, 2, ..., k+1 of the full data set and runs local search starting from the (smoothed) model estimated at step k. This truly is grammar induction (a minimal sketch of the schedule follows this list).
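
A minimal sketch of this schedule, assuming a hypothetical ''train_em'' placeholder for one DMV EM run and a corpus given as a list of POS-tag sequences; it shows only the curriculum scaffolding, not the authors' model code.

<code python>
# Baby Steps curriculum (sketch): train on sentences of length <= k,
# warm-starting each step from the (smoothed) model of the previous step.
# `train_em` is a hypothetical stand-in for a real DMV EM implementation.

def train_em(sentences, init_model=None, smoothing=0.1):
    """Placeholder for one EM run of the DMV; returns an opaque model."""
    return {"trained_on": len(sentences), "init": init_model, "smoothing": smoothing}

def baby_steps(corpus, max_len=45):
    """corpus: list of sentences, each a list of POS tags."""
    model = None  # the base case (k = 1) needs no initialization at all
    for k in range(1, max_len + 1):
        subset = [s for s in corpus if len(s) <= k]  # lengths 1, 2, ..., k
        # step k starts local search (EM) from the smoothed model of step k-1
        model = train_em(subset, init_model=model)
    return model
</code>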

==== Less is More ====
  * Just using simple, short sentences is not enough.
  * Using Baby Steps’ learning curve as a guide, we find a "sweet spot": sentence lengths that are neither too long (excluding the truly daunting examples) nor too few (supplying enough accessible information). A sketch of this batch setting follows the list.
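
For contrast, a sketch of this batch setting under the same assumptions as above (hypothetical ''train_em'' placeholder, corpus as POS-tag sequences): one EM run restricted to sentences no longer than the sweet-spot length, with longer sentences discarded.

<code python>
# "Less is More" (sketch): a single batch EM run on sentences up to the
# sweet-spot length, discarding everything longer.  `train_em` is again a
# hypothetical stand-in for real DMV training.

def train_em(sentences, init_model=None):
    """Placeholder for one EM run of the DMV."""
    return {"trained_on": len(sentences), "init": init_model}

def less_is_more(corpus, sweet_spot=15):
    """Train once on the length-filtered subset (15 is the paper's WSJ cutoff)."""
    subset = [s for s in corpus if len(s) <= sweet_spot]
    return train_em(subset)  # standard initialization, single batch
</code>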

==== Leapfrog ====
  * As an alternative to discarding data, a better use of resources is to combine the results of batch and iterative training up to the sweet-spot gradation of the data, then iterate with a large step size (a sketch follows).
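
A rough sketch of that combination, with the same hypothetical placeholders as above; how the two models are merged (here a naive ''combine_models'' stub) and the exact step size are illustrative assumptions, not the authors' settings.

<code python>
# Leapfrog (sketch): reuse the Baby Steps and batch models at the sweet spot,
# then keep training while jumping ahead in sentence length by a large step
# instead of one length at a time.  All helpers are hypothetical placeholders.

def train_em(sentences, init_model=None):
    """Placeholder for one EM run of the DMV."""
    return {"trained_on": len(sentences), "init": init_model}

def combine_models(model_a, model_b):
    """Placeholder: e.g. mix the two models' parameters before re-estimating."""
    return {"mixture_of": (model_a, model_b)}

def leapfrog(corpus, baby_steps_model, batch_model,
             sweet_spot=15, step=30, max_len=45):
    model = combine_models(baby_steps_model, batch_model)  # both trained up to sweet_spot
    k = sweet_spot
    while k < max_len:
        k = min(k + step, max_len)                         # large step size
        subset = [s for s in corpus if len(s) <= k]
        model = train_em(subset, init_model=model)
    return model
</code>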
