Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

ufal:tasks [2012/01/23 11:07]
ufal
ufal:tasks [2012/01/23 11:15] (current)
ufal
Line 44: Line 44:
 | **contact:** | | | **contact:** | |
  
 +=== Tokenizers integrated in Treex ===
 +* rule-based (regular-expression) tokenizers
 +* trainable tokenizer TextSeg
  
 ===== Language Identification ===== ===== Language Identification =====
 +Martin Majliš's language identifier (covers about 100 languages): http://wiki.ufal.ms.mff.cuni.cz/~majlis/publications/master-thesis.pdf
  
 ===== Sentence Segmentation ===== ===== Sentence Segmentation =====
 +=== Segmenters integrated in Treex ===
 +* rule-based segmenters
 +* TextSeg (trainable)
  
 ===== Morphological Segmentation ===== ===== Morphological Segmentation =====
  
 ===== Morphological Analysis ===== ===== Morphological Analysis =====
 +=== Morphological Analyzers integrated in Treex ===
 +* Jan Hajič's Czech morphological analyzer
 +* toy analyzers for about ten languages (student homework assignments)
  
 ===== Part-of-Speech Tagging ===== ===== Part-of-Speech Tagging =====
Line 102: Line 112:
  
 ===== Coreference resolution ===== ===== Coreference resolution =====
 +=== Coreference resolvers integrated in Treex ===
 +* simple rule-based baseline resolvers for Czech and English
 +* Michal Novák's trainable resolvers
 +* Ngụy Giang Linh's trainable (perceptron-based) resolver
  
 ===== Spell Checking ===== ===== Spell Checking =====
