Differences

This shows you the differences between two versions of the page.

--- ufal:tasks [2012/01/23 10:52]
ufal
+++ ufal:tasks [2012/01/23 11:15] (current)
ufal
@@ Line 44: / Line 44: @@
 | **contact:** | |
+=== Tokenizers integrated in Treex ===
+* rule-based (reg.exp.) tokenizers
+* trainable tokenizer TextSeg
 ===== Language Identification =====
+Martin Majliš's language identifier (covers about 100 languages) http://wiki.ufal.ms.mff.cuni.cz/~majlis/publications/master-thesis.pdf
 ===== Sentence Segmentation =====
+=== Segmenters integrated in Treex ===
+* rule-based segmenters
+* TextSeg (trainable)
 ===== Morphological Segmentation =====
 ===== Morphological Analysis =====
+=== Morphological Analyzers integrated in Treex ===
+* Jan Hajič's Czech morphological analyzer
+* toy analyzers for about ten languages (student homework assignments)
 ===== Part-of-Speech Tagging =====
@@ Line 68: / Line 78: @@
 ===== Lemmatization =====
+=== Lemmatizers integrated in Treex ===
+* Martin Popel's lemmatizer for English
+* a number of toy lemmatizers for about ten languages (student homework assignments)
+* for Czech, lemmatization is traditionally treated as part of POS disambiguation, so almost all Czech taggers are capable of lemmatization
 ===== Analytical Parsing =====
+=== Analytical parsers integrated in Treex ===
+* Ryan McDonald's MST parser
+* Rudolf Rosa's MST parser
+* MALT parser
+* ZPar
+* Stanford parser
+=== Details on Czech parsing ===
+A Complete Guide to Czech Language Parsing http://ufal.mff.cuni.cz/czech-parsing/
 ===== Tectogrammatical Parsing =====
+=== Conversion of analytical trees to tectogrammatical trees integrated in Treex ===
+* a scenario for rule-based tree transformation
+* Ondřej Dušek's tools for functor assignment trained on PDT and PCEDT
 ===== Named Entity Recognition =====
+=== NE recognizers integrated in Treex ===
+* Jana Straková's SVM-based recognizer for Czech http://www.aclweb.org/anthology/W/W09/W09-3538.pdf
+* Stanford Named Entity Recognizer for Czech
 ===== Machine Translation =====
+=== MT implemented in Treex ===
+* elaborate English->Czech tecto-based translation
+* prototype of Czech->English tecto-based translation
 ===== Coreference Resolution =====
+=== Coreference resolvers integrated in Treex ===
+* simple rule-based baseline resolvers for Czech and English
+* Michal Novák's trainable resolvers
+* Ngụy Giang Linh's trainable (perceptron-based) resolver
 ===== Spell Checking =====


Institute of Formal and Applied Linguistics Wiki
