Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

ufal:tasks [2012/01/23 10:59]
ufal
ufal:tasks [2012/01/23 11:15]
ufal
Line 44: Line 44:
 | **contact:** | | | **contact:** | |
  
 +=== Tokenizers integrated in Treex ===
 +* rule-based (regular-expression) tokenizers
 +* trainable tokenizer TextSeg
  
 ===== Language Identification ===== ===== Language Identification =====
 +Martin Majliš's language identifier (covers about 100 languages) http://wiki.ufal.ms.mff.cuni.cz/~majlis/publications/master-thesis.pdf
  
 ===== Sentence Segmentation ===== ===== Sentence Segmentation =====
Line 52: Line 56:
  
 ===== Morphological Analysis ===== ===== Morphological Analysis =====
 +=== Morphological Analyzers integrated in Treex ===
 +* Jan Hajič's Czech morphological analyzer
 +* toy analyzers for about ten languages (student homework assignments)
  
 ===== Part-of-Speech Tagging ===== ===== Part-of-Speech Tagging =====
Line 86: Line 93:
  
 ===== Tectogrammatical Parsing ===== ===== Tectogrammatical Parsing =====
 +=== Conversion of analytical trees to tectogrammatical trees integrated in Treex ===
 +* a scenario for rule-based tree transformation
 +* Ondřej Dušek's tools for functor assignment trained on PDT and PCEDT
  
 ===== Named Entity Recognition ===== ===== Named Entity Recognition =====
 +=== NE recognizers integrated in Treex ===
 +* Jana Straková's SVM-based recognizer for Czech http://www.aclweb.org/anthology/W/W09/W09-3538.pdf
 +* Stanford Named Entity Recognizer for Czech
  
 ===== Machine Translation ===== ===== Machine Translation =====
 +
 +=== MT implemented in Treex ===
 +* elaborate English->Czech tecto-based translation
 +* prototype of Czech->English tecto-based translation
  
 ===== Coreference resolution ===== ===== Coreference resolution =====
 +=== Coreference resolvers integrated in Treex ===
 +* simple rule-based baseline resolvers for Czech and English
 +* Michal Novák's trainable resolvers
 +* Ngụy Giang Linh's trainable (perceptron-based) resolver
  
 ===== Spell Checking ===== ===== Spell Checking =====