Linda Wiechetek, Francis M. Tyers, and Thomas Omma: Shooting at Flies in the Dark: Rule-Based Lexical Selection for a Minority Language Pair
Topic
Adding a lexical selection module to a rule-based MT system for translation from North Sami to Lule Sami.
Pluses
- All the resources and source code are downloadable.
- Grammar rules of the minority languages (which may be dying out) are formalised and thus preserved.
- Many real-world examples from the discussed languages make the paper easier to understand. -MK-
Minuses
- The approach needs a lot of human work.
- The evaluation is unclear and subjective. Sentence pairs which the authors considered non-equivalent were removed from the data. There is no baseline approach to compare against.
Questions
- Where does the semantic information come from? Is it from manual annotation?
- Assimilation vs. dissemination: does it make a difference when we design an MT system?
In a statistical system, probably not. In a rule-based system, there may be an effort to make the errors predictable and the output easily post-editable when the aim is dissemination.
- What is the difference between a minority language and an underresourced language?
We guess that a minority language is a language with a small number of speakers. There are underresourced languages that can be considered major, such as Indonesian.
Assimilation & dissemination criteria can be used for evaluating the performance of the resulting system. -MK-