Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

ufal:tasks [2012/01/23 11:09]
ufal
ufal:tasks [2012/01/23 11:15] (current)
ufal
Line 44: Line 44:
 | **contact:** | | | **contact:** | |
  
 +=== Tokenizers integrated in Treex ===
 +* rule-based (reg.exp.) tokenizers
 +* trainable tokenizer TextSeg
  
 ===== Language Identification ====== ===== Language Identification ======
 +Martin Majliš's language identifier (covers about 100 languages) http://wiki.ufal.ms.mff.cuni.cz/~majlis/publications/master-thesis.pdf
  
 ===== Sentence Segmentation ===== ===== Sentence Segmentation =====
 +=== Segmenters integrated in Treex ===
 +* rule-based segmenters
 +* TextSeg (trainable)
  
 ===== Morphological Segmentation ===== ===== Morphological Segmentation =====
  
 ===== Morphological Analysis ===== ===== Morphological Analysis =====
 +=== Morphological Analyzers integrated in Treex ===
 +* Jan Hajič's Czech morphological analyzer
 +* toy analyzers for about ten languages (student homework assignments)
  
 ===== Part-of-Speech Tagging ===== ===== Part-of-Speech Tagging =====
