Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
courses:rg:morphology-induction-with-spelling-rules [2011/04/08 13:32] bejcek created |
courses:rg:morphology-induction-with-spelling-rules [2011/04/19 13:05] (current) abzianidze |
||
---|---|---|---|
Line 5: | Line 5: | ||
===== Introduction ===== | ===== Introduction ===== | ||
- | * Paper describes morphology induction using Bayesian approach | + | * The paper describes morphology induction using Bayesian approach |
+ | * It is based on the Minimum Description Length (MDL) principle | ||
* Baseline: Goldwater et al., 2006: [[http:// | * Baseline: Goldwater et al., 2006: [[http:// | ||
* only stem & suffix | * only stem & suffix | ||
- | * Dirichlet priors over the multinomial distributions for word class, stem and for suffix | + | * Dirichlet priors over the multinomial distributions for word class, stem and suffix |
- | | + | |
- | * Improvements to baseline system | + | * Improvements to the baseline system |
- | * introduces spelling rules (context/ | + | * introduces spelling rules (context/ |
* Dirichlet priors first set by hand: to prefer empty rules to deletion/ | * Dirichlet priors first set by hand: to prefer empty rules to deletion/ | ||
===== What do we dislike about the paper ===== | ===== What do we dislike about the paper ===== | ||
- | * Experiments are done olny for English and only for verbs -- that is too narrow a scope | + | * Experiments are done only for English and only for verbs -- that is too narrow a scope |
* Results are not convincing enough -- F-score and underlying form accuracy outperform baseline only for stems (not for suffixes) and precision doesn' | * Results are not convincing enough -- F-score and underlying form accuracy outperform baseline only for stems (not for suffixes) and precision doesn' | ||
+ | |||
===== What do we like about the paper ===== | ===== What do we like about the paper ===== | ||
* Loganathan has the code (although he couldn' | * Loganathan has the code (although he couldn' | ||
+ | * Spelling rules are also simultaneously learned along with morphological analysis | ||
+ | * It's unsupervised and clever: using just a couple of (hyper)parameters (some of them are learned automatically), |
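The Introduction notes mention Dirichlet priors over the multinomial distributions for word class, stem and suffix. As a rough illustration of what such a prior does, here is a minimal sketch of the posterior predictive probability of a suffix under a symmetric Dirichlet prior. The toy counts, the suffix inventory, the value of `alpha` and the helper name `predictive_prob` are all assumptions made for this example; they are not taken from the paper or from Goldwater et al., 2006.

```python
from collections import Counter


def predictive_prob(counts, item, vocab_size, alpha):
    """Probability of seeing `item` next, given the counts observed so far,
    under a multinomial with a symmetric Dirichlet(alpha) prior.
    Formula: (n_item + alpha) / (N + alpha * K)."""
    total = sum(counts.values())
    return (counts[item] + alpha) / (total + alpha * vocab_size)


# Hypothetical suffix counts from analyses made so far ("" is the empty suffix).
suffix_counts = Counter({"": 5, "ed": 3, "ing": 2})

# Hypothetical closed inventory of candidate suffixes; "s" has not been seen yet.
suffixes = ["", "ed", "ing", "s"]

probs = {
    s: predictive_prob(suffix_counts, s, len(suffixes), alpha=0.5)
    for s in suffixes
}

for suffix, p in probs.items():
    print(repr(suffix), round(p, 3))
```

With a small `alpha`, frequently used suffixes receive most of the probability mass, while an unseen suffix such as `"s"` still keeps a small non-zero probability. Asymmetric hyperparameters would be one way to express a preference such as favouring empty spelling rules, though the exact setup used in the paper is not reproduced here.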