Differences
This shows you the differences between two versions of the page.
| courses:rg:morphology-induction-with-spelling-rules [2011/04/08 13:32] bejcek created | courses:rg:morphology-induction-with-spelling-rules [2011/04/19 13:05] (current) abzianidze | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| ===== Introduction ===== | ===== Introduction ===== | ||
| - | * Paper describes morphology induction using Bayesian approach | + | * The paper describes morphology induction using Bayesian approach |
| + | * It is based on the Minimum Description Length (MDL) principle | ||
| * Baseline: Goldwater et al., 2006: [[http:// | * Baseline: Goldwater et al., 2006: [[http:// | ||
| * only stem & suffix | * only stem & suffix | ||
| - | * Dirichlet priors over the multinominal distributions for word class, stem and for suffix | + | * Dirichlet priors over the multinominal distributions for word class, stem and suffix |
| - | | + | |
| - | * Improvements to baseline system | + | * Improvements to the baseline system |
| - | * introduces spelling rules (context/ | + | * introduces spelling rules (context/ |
| * Dirichlet priors first set by hand: to prefer empty rules to deletion/ | * Dirichlet priors first set by hand: to prefer empty rules to deletion/ | ||
| ===== What do we dislike about the paper ===== | ===== What do we dislike about the paper ===== | ||
| - | * Experiments are done olny for English and only for verbs -- that's constrained too much | + | * Experiments are done only for English and only for verbs -- that's constrained too much |
| * Results are not convincing enough -- F-score and underlying form accuracy outperform baseline only for stems (not for suffixes) and precision doesn' | * Results are not convincing enough -- F-score and underlying form accuracy outperform baseline only for stems (not for suffixes) and precision doesn' | ||
| + | |||
| ===== What do we like about the paper ===== | ===== What do we like about the paper ===== | ||
| * Loganathan has the code (although he couldn' | * Loganathan has the code (although he couldn' | ||
| + | * Spelling rules are also simultaneously learned along with morphological analysis | ||
| + | * It's unsupervised and clever: using just a couple of (hyper)parameters (some of them are learned automatically), | ||
