Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

ufal:tasks [2012/01/18 12:59]
ufal created
ufal:tasks [2012/01/18 14:53]
ufal
Line 1: Line 1:
 ====== Overview of NLP/CL tools available at UFAL ====== ====== Overview of NLP/CL tools available at UFAL ======
  
-Tokenization +===== Tokenization (word segmentation) ===== 
-Language Identification +Segmentation of text into tokens (words, punctuation marks, etc.). For languages that separate words with spaces (English, Czech, etc.), the task is relatively easy. For languages that do not (Chinese, Japanese, etc.), the task is much more difficult. 
-Sentence Segmentation + 
-Morphological Segmentation +=== Europarl tokenizer === 
-Morphological Analysis +  * **info:** A sample rule-based tokenizer. It can use a list of nonbreaking prefixes, i.e. strings that are usually followed by a period but do not end a sentence. Distributed as part of the Europarl tools. 
-Part-of-Speech Tagging +  * **version:** v6 (Jan 2012)  
-Lemmatization +  * **author:** Philipp Koehn and Josh Schroeder 
-Analytical Parsing +  * **licence:** free 
-Tectogrammatical Parsing +  * **url:** http://www.statmt.org/europarl/ 
-Named Entity Recognition +  * **languages:** applicable to all languages using space-separated words; nonbreaking prefixes available for DE, EL, EN, ES, FR, IT, PT, SV. 
-Machine Translation +  * **efficiency:** NA 
-Coreference resolution +  * **contact:** 
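
The idea behind a rule-based tokenizer with nonbreaking prefixes can be sketched as follows. This is a minimal illustration, not the Europarl tokenizer itself: the function name `tokenize` and the short prefix set `NONBREAKING_PREFIXES` are assumptions made for the example, and the real Europarl prefix lists are per-language files that are much longer.

```python
import re

# Hypothetical, tiny nonbreaking-prefix list for illustration only;
# it is not taken from the Europarl distribution.
NONBREAKING_PREFIXES = {"Mr", "Mrs", "Dr", "Prof", "etc", "e.g", "i.e"}


def tokenize(text, prefixes=NONBREAKING_PREFIXES):
    """Split space-separated text into word and punctuation tokens."""
    tokens = []
    for chunk in text.split():
        # Keep the trailing period attached when the word is a known prefix.
        if chunk.endswith(".") and chunk[:-1] in prefixes:
            tokens.append(chunk)
            continue
        # Otherwise split the chunk into runs of word characters
        # and single punctuation marks.
        tokens.extend(re.findall(r"\w+|[^\w\s]", chunk))
    return tokens


print(tokenize("Dr. Smith arrived, finally."))
```

Here the period after `Dr` stays attached because `Dr` is in the prefix set, while the period after `finally` becomes a separate token. Languages without space-separated words would need a different approach, since the sketch relies on whitespace to find candidate token boundaries.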
-Spell Checking + 
-Text Similarity +===== Language Identification ===== 
-Recasing + 
-Diacritics Restoration +===== Sentence Segmentation ===== 
 + 
 +===== Morphological Segmentation ===== 
 + 
 +===== Morphological Analysis ===== 
 + 
 +===== Part-of-Speech Tagging ===== 
 + 
 +===== Lemmatization ===== 
 + 
 +===== Analytical Parsing ===== 
 + 
 +===== Tectogrammatical Parsing ===== 
 + 
 +===== Named Entity Recognition ===== 
 + 
 +===== Machine Translation ===== 
 + 
 +===== Coreference Resolution ===== 
 + 
 +===== Spell Checking ===== 
 + 
 +===== Text Similarity ===== 
 + 
 +===== Recasing ===== 
 + 
 +===== Diacritics Restoration =====
  
  
