Improving morphology induction by learning spelling rules, ACL 2009

Presented by: Loganathan Ramasamy
Report by: Eduard Bejček

Introduction

The paper describes morphology induction using a Bayesian approach
Baseline: Goldwater et al., 2006: Interpolating between types and tokens by estimating power-law generators
- only stem & suffix
- Dirichlet priors over the multinomial distributions for word class, stem and suffix
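
To make the role of the Dirichlet prior concrete, here is a minimal sketch of the standard posterior predictive probability under a symmetric Dirichlet prior over a multinomial. It is not the baseline's actual code; the function name, the toy suffix counts, the value of alpha and the number of possible outcomes are all assumptions chosen for illustration.

```python
from collections import Counter


def dirichlet_predictive(counts, item, alpha, num_outcomes):
    """Predictive probability of `item` given observed `counts`,
    under a symmetric Dirichlet(alpha) prior over `num_outcomes` outcomes:
        (n_item + alpha) / (N + alpha * num_outcomes)
    """
    total = sum(counts.values())
    return (counts[item] + alpha) / (total + alpha * num_outcomes)


# Hypothetical toy counts of suffixes seen so far (illustrative only).
suffix_counts = Counter({"ing": 3, "ed": 2, "s": 1})

# Assume 4 possible suffixes in total, one of them ("er") not yet observed.
p_seen = dirichlet_predictive(suffix_counts, "ing", alpha=0.5, num_outcomes=4)
p_unseen = dirichlet_predictive(suffix_counts, "er", alpha=0.5, num_outcomes=4)
```

A small alpha keeps most of the probability mass on outcomes that were already observed, while still leaving some for unseen ones.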
Improvements to the baseline system
- introduces spelling rules (context/change, e.g.: “ut_i”/ε→t (in “shut.ing”) or “ke_i”/e→ε (in “take.ing”))
- Dirichlet priors are first set by hand, so that empty (no-change) rules are preferred to deletion/insertion
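
The two example rules can be illustrated with a small sketch that maps an underlying form (stem + suffix) to a surface form. This is only one reading of the context/change notation, not the paper's implementation: it assumes that the context is split at "_" into characters before and after the morpheme boundary, that an insertion happens at the boundary, and that a deletion removes stem-final characters. The tuple layout and the name `apply_rule` are hypothetical.

```python
def apply_rule(stem, suffix, rule):
    """Apply one spelling rule at the stem/suffix boundary.

    rule = (left_context, right_context, source, target)
    e.g. "ut_i"/ε→t  is  ("ut", "i", "", "t")
         "ke_i"/e→ε  is  ("ke", "i", "e", "")
    The empty string stands for ε. If the context does not match,
    the morphemes are simply concatenated (the empty rule).
    """
    left, right, source, target = rule
    if not (stem.endswith(left) and suffix.startswith(right)):
        return stem + suffix
    if source:
        # Assumption: the changed characters sit at the end of the stem.
        if not stem.endswith(source):
            return stem + suffix
        stem = stem[: -len(source)]
    return stem + target + suffix


insert_t = ("ut", "i", "", "t")   # ε→t in context "ut_i"
delete_e = ("ke", "i", "e", "")   # e→ε in context "ke_i"

shutting = apply_rule("shut", "ing", insert_t)
taking = apply_rule("take", "ing", delete_e)
walking = apply_rule("walk", "ing", insert_t)  # context does not match
```

Under these assumptions, “shut.ing” surfaces as “shutting” and “take.ing” as “taking”, while a word whose boundary context matches neither rule is left unchanged.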

Experiments are done only for English and only for verbs, which is too restrictive
Results are not convincing enough: F-score and underlying form accuracy outperform the baseline only for stems (not for suffixes), and precision doesn't outperform the baseline at all