courses:rg:from-baby-steps-to-leapfrog-how-less-is-more-in-unsupervised-dependency-parsing:majlis [2011/10/05 15:59] majlis created
courses:rg:from-baby-steps-to-leapfrog-how-less-is-more-in-unsupervised-dependency-parsing:majlis [2011/10/05 16:36] (current) majlis
  * [[http://
===== 1 Introduction =====
  * [[http://
  * Headden et al.'s (2009) Extended Valence Grammar (EVG) combats data sparsity with smoothing alone, training on the same small subset of the treebank as the classic implementation of the DMV.
  * Cohen and Smith (2009) use more complicated algorithms (variational EM and MBR decoding) and stronger linguistic hints (tying related parts of speech and syntactically similar bilingual data).

===== 2 Intuition =====
==== Baby Steps ====
  * Its base case — sentences of length one — has a trivial solution that requires neither initialization nor search, yet reveals something of sentence heads.
  * The next step — sentences of length one and two — refines initial impressions of heads, introduces dependents, and exposes their identities and relative positions.
  * Step k + 1 augments the training input to include lengths 1, 2, ..., k, k + 1 of the full data set and executes local search starting from the (smoothed) model estimated by step k. This truly is grammar induction.
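The schedule above can be written down as a short training loop. This is a sketch under our own naming, not the authors' code: `train` stands in for the paper's local search (EM for the DMV) and `smooth` for its smoothing step.

```python
# A minimal sketch of the Baby Steps schedule: at step k+1, train on all
# sentences of length <= k+1, initializing local search from the
# (smoothed) model estimated at step k.

def baby_steps(corpus, train, smooth, max_len):
    """corpus: list of token lists; train(data, init) runs local search
    (e.g. EM for the DMV); smooth re-estimates the model with smoothing."""
    model = None  # step 1 (length-one sentences) needs no initialization
    for k in range(1, max_len + 1):
        data = [s for s in corpus if len(s) <= k]  # gradations 1..k
        if data:
            model = smooth(train(data, init=model))
    return model

# Toy demonstration: the stand-in "trainer" just counts how many
# gradations it has been run on.
corpus = [["saw"], ["she", "saw"], ["she", "saw", "stars"]]
steps = baby_steps(corpus,
                   train=lambda data, init: (init or 0) + 1,
                   smooth=lambda m: m,
                   max_len=3)
```

Each gradation reuses the previous model as its starting point, which is the whole point of the schedule: no step ever starts local search from scratch.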

==== Less is More ====
  * Just using simple, short sentences is not enough.
  * We find a "sweet spot" — sentence lengths that are neither too long (excluding the truly daunting examples) nor too few (supplying enough accessible information).
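Operationally, the "Less is More" recipe reduces to filtering the training set at the sweet-spot cut-off and batch-training once. A hedged sketch, with `less_is_more`, `train`, and `sweet_spot` being our names rather than the paper's:

```python
# Sketch of "Less is More": discard sentences longer than the sweet-spot
# cut-off and batch-train a single model on what remains.

def less_is_more(corpus, train, sweet_spot):
    data = [s for s in corpus if len(s) <= sweet_spot]
    return train(data)

# Toy demonstration: with a cut-off of 2, only two of the three
# sentences survive the filter.
corpus = [["saw"], ["she", "saw"], ["she", "saw", "stars"]]
kept = less_is_more(corpus, train=lambda data: len(data), sweet_spot=2)
```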

==== Leapfrog ====
  * As an alternative to discarding data, a better use of resources is to combine the results of batch and iterative training up to the sweet-spot data gradation, then iterate with a large step size.
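That recipe can be sketched as three phases. Again the naming is ours: `train` stands in for EM local search, and `combine` for however the batch and iterative models' results are mixed, which these notes leave unspecified.

```python
# Sketch of Leapfrog: train iteratively (Baby Steps style) and in batch
# up to the sweet spot, combine the two models, then keep iterating
# over larger gradations with a big step size.

def leapfrog(corpus, train, combine, sweet_spot, big_step, max_len):
    # 1) iterative training on gradations 1..sweet_spot
    iterative = None
    for k in range(1, sweet_spot + 1):
        iterative = train([s for s in corpus if len(s) <= k], init=iterative)
    # 2) batch training at the sweet spot, from a default initializer
    batch = train([s for s in corpus if len(s) <= sweet_spot], init=None)
    # 3) mix both models, then iterate with a large step size
    model = combine(iterative, batch)
    for k in range(sweet_spot + big_step, max_len + 1, big_step):
        model = train([s for s in corpus if len(s) <= k], init=model)
    return model
```

The large step size is what makes this cheaper than running Baby Steps all the way out: beyond the sweet spot, most gradations are skipped rather than trained on.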