Differences

This shows you the differences between two versions of the page.

--- courses:rg:multilingual-noise-robust-supervised-morphological-analysis-using-the-wordframe-model [2011/01/07 15:11]
kirschner created
+++ courses:rg:multilingual-noise-robust-supervised-morphological-analysis-using-the-wordframe-model [2011/01/09 17:45]
kirschner
@@ Line 5: / Line 5: @@
 ===== Comments =====
-  *
+=== Summary ===
+  * In this paper the author presents a new supervised method for lemmatization, called the WordFrame model.
+  * The new method is compared to the existing End-of-String method and performs better in most cases.
+    * A combination of both methods gives even better results.
+  * The results are evaluated on 30 different languages, with a median accuracy of 97.5%.
+  * The WordFrame model trains well on noisy data, so it can be used in co-training with unsupervised methods.
+=== Described models ===
+Both models described in this paper are meant to decompose a word into basic parts (similar to morphemes, but not identical).
+==Extended End-of-String model==
+Decomposition of an inflection into:
+  * prefix - //concatenation of all prefixes//
+  * primary common substring - //the stem//
+  * point of suffixation change - //phonologically induced letter change on the boundary of stem and suffix//
+  * suffix/ending - //concatenation of all suffixes of the word//
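The four parts listed above can be illustrated with a minimal Python sketch. It is not the paper's training algorithm: the affix lists, the `analyze` and `longest_affix` helpers, and the English examples are hypothetical choices made for illustration only. The sketch strips the longest listed affixes and treats the longest common prefix of the remainder and the lemma as the stem; whatever is left between the stem and the suffix is reported as the point-of-suffixation change.

```python
import os
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class EndOfStringAnalysis:
    prefix: str              # concatenation of all prefixes
    stem: str                # primary common substring
    change: Tuple[str, str]  # point-of-suffixation change: (inflection side, lemma side)
    suffix: str              # concatenation of all suffixes


def longest_affix(word: str, affixes: Sequence[str], at_end: bool) -> str:
    """Return the longest listed affix that matches and leaves part of the word."""
    matches = [
        a for a in affixes
        if a and len(a) < len(word)
        and (word.endswith(a) if at_end else word.startswith(a))
    ]
    return max(matches, key=len, default="")


def analyze(inflection: str, lemma: str,
            prefixes: Sequence[str] = (),
            suffixes: Sequence[str] = ()) -> EndOfStringAnalysis:
    """Toy decomposition; the affix lists are assumed to be given (hypothetical)."""
    prefix = longest_affix(inflection, prefixes, at_end=False)
    core = inflection[len(prefix):]
    suffix = longest_affix(core, suffixes, at_end=True)
    core = core[:len(core) - len(suffix)]
    # The stem is what the stripped inflection and the lemma share from the left.
    stem = os.path.commonprefix([core, lemma])
    change = (core[len(stem):], lemma[len(stem):])
    return EndOfStringAnalysis(prefix, stem, change, suffix)


print(analyze("tries", "try", suffixes=("s", "es", "ed", "ing")))
```

For the pair "tries"/"try" this yields the stem "tr", the suffix "es", and the change "i" (inflection side) versus "y" (lemma side). In this simplified sketch any ending of the lemma itself also ends up on the lemma side of the change.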
+==WordFrame model==
+Decomposition of an inflection into:
+  * prefix - //concatenation of all prefixes//
+  * point of prefixation change - //phonologically induced letter change on the boundary of first part of stem and prefix//
+  * secondary common substring - //the part of stem before stem vowel change//
+  * vowel change - //the vowel change inside the stem//
+  * primary common substring - //the part of stem after the vowel change//
+  * point of suffixation change - //phonologically induced letter change on the boundary of stem and suffix//
+  * suffix/ending - //concatenation of all suffixes of the word//
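As a rough illustration of the seven parts listed above, the following Python sketch stores one analysis and reassembles the inflected form from it. The class name, the field names, the convention of storing each change as an (inflection side, lemma side) pair, and the German and English examples are all assumptions made here; none of them is taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

Change = Tuple[str, str]  # (letters in the inflection, letters in the lemma)


@dataclass
class WordFrameAnalysis:
    prefix: str            # concatenation of all prefixes
    prefix_change: Change  # point-of-prefixation change
    secondary: str         # secondary common substring (stem part before the vowel change)
    vowel_change: Change   # vowel change inside the stem
    primary: str           # primary common substring (stem part after the vowel change)
    suffix_change: Change  # point-of-suffixation change
    suffix: str            # concatenation of all suffixes

    def inflection(self) -> str:
        """Concatenate the inflection-side pieces from left to right."""
        return (self.prefix + self.prefix_change[0] + self.secondary
                + self.vowel_change[0] + self.primary
                + self.suffix_change[0] + self.suffix)

    def lemma_stem(self) -> str:
        """Lemma-side pieces, without affixes (the lemma's own ending is not modelled here)."""
        return (self.prefix_change[1] + self.secondary
                + self.vowel_change[1] + self.primary
                + self.suffix_change[1])


# Hypothetical analysis of German "gesungen" (lemma "singen").
gesungen = WordFrameAnalysis("ge", ("", ""), "s", ("u", "i"), "ng", ("", ""), "en")
print(gesungen.inflection(), gesungen.lemma_stem())
```

The point of the extra fields, compared with the End-of-String decomposition, is that a stem-internal vowel change such as "i" to "u" can be recorded without shrinking the common substring down to the single letter "s".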
 ===== Suggested Additional Reading =====
+   * [[http://www.cs.swarthmore.edu/~richardw/pubs/thesis.pdf|R. Wicentowski, 2002, PhD thesis]]
@@ Line 14: / Line 40: @@
 ===== What do we like about the paper =====
-  *
+  * Robustness of the algorithm in noisy conditions
+  * Evaluation on many different languages
 ===== What do we dislike about the paper =====
-  *
+  * Doesn't do morphological analysis, only lemmatization
+  * Experiments done only on verbs
+  * The paper doesn't say which option the algorithm selects when there is more than one possible correct result
+  * The algorithm uses only features based on the word itself; it doesn't use context
+  * With the information given in this paper, we wouldn't be able to write a program to verify the results
+===== Questions =====
+  * Does the term //point of prefixation// mean the same as the term //morpheme boundary//?
 Written by Martin Kirschner


Institute of Formal and Applied Linguistics Wiki
