Differences

This shows you the differences between two versions of the page.

--- ufal:tasks [2012/01/18 12:59]
ufal created
+++ ufal:tasks [2012/01/19 12:01]
ufal
@@ Line 1: / Line 1: @@
 ====== Overview of NLP/CL tools available at UFAL ======
-Tokenization
+===== Tokenization (word segmentation) =====
-Language Identification
+Segmentation of text into tokens (words, punctuation marks, etc.). For languages that separate words with spaces (English, Czech, etc.), the task is relatively easy. For languages that do not (Chinese, Japanese, etc.), it is much more difficult.
-Sentence Segmentation
-Morphological Segmentation
-Morphological Analysis
-Part-of-Speech Tagging
-Lemmatization
-Analytical Parsing
-Tectogrammatical Parsing
-Named Entity Recognition
-Machine Translation
-Coreference resolution
-Spell Checking
-Text Similarity
-Recasing
-Diacritic Reconstruction
+=== Europarl tokenizer ===
+  * **description:** A sample rule-based tokenizer that can use a list of nonbreaking prefixes, i.e. strings that are usually followed by a dot but do not end a sentence. Distributed as part of the Europarl tools.
+  * **version:** v6 (Jan 2012)
+  * **author:** Philipp Koehn and Josh Schroeder
+  * **licence:** free
+  * **url:** http://www.statmt.org/europarl/
+  * **languages:** in principle applicable to all languages using space-separated words; nonbreaking prefixes available for DE, EL, EN, ES, FR, IT, PT, SV.
+  * **efficiency:** NA
+  * **reference:**
+  * **contact:**
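The nonbreaking-prefix idea from the description above can be sketched roughly as follows. This is a minimal illustration in Python, not the Europarl tokenizer itself: the function name `tokenize` and the tiny `NONBREAKING_PREFIXES` set are hypothetical, and the real tools ship their own per-language prefix lists.

```python
import re

# Hypothetical, tiny prefix list for illustration only; it is not taken
# from the Europarl distribution.
NONBREAKING_PREFIXES = {"Mr", "Mrs", "Dr", "Prof", "etc", "e.g", "i.e"}


def tokenize(text, prefixes=NONBREAKING_PREFIXES):
    """Split space-separated text into tokens, keeping the dot attached
    to known nonbreaking prefixes."""
    # Step 1: put spaces around every punctuation mark except the dot.
    spaced = re.sub(r"([^\w\s.])", r" \1 ", text)

    tokens = []
    for word in spaced.split():
        # Step 2: decide whether a trailing dot is a token of its own.
        if word.endswith(".") and len(word) > 1:
            stem = word[:-1]
            if stem in prefixes:
                # Known prefix: the dot belongs to the word.
                tokens.append(word)
            else:
                # Otherwise treat the dot as separate punctuation.
                tokens.extend([stem, "."])
        else:
            tokens.append(word)
    return tokens


if __name__ == "__main__":
    print(tokenize("Dr. Smith arrived, e.g. yesterday."))
```

The sketch relies on whitespace to find word boundaries, so it only applies to languages with space-separated words. It also splits apostrophes and hyphens into separate tokens, which a production tokenizer would handle with language-specific rules.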
+===== Language Identification =====
+===== Sentence Segmentation =====
+===== Morphological Segmentation =====
+===== Morphological Analysis =====
+===== Part-of-Speech Tagging =====
+===== Lemmatization =====
+===== Analytical Parsing =====
+===== Tectogrammatical Parsing =====
+===== Named Entity Recognition =====
+===== Machine Translation =====
+===== Coreference Resolution =====
+===== Spell Checking =====
+===== Text Similarity =====
+===== Recasing =====
+===== Diacritic Reconstruction =====
+====== Other tasks ======
+Word Sense Disambiguation
+Relationship Extraction
+Topic Segmentation
+Information Retrieval
+Information Extraction
+Text Summarization
+Speech Reconstruction
+Question Answering
+Sentiment Analysis


Institute of Formal and Applied Linguistics Wiki
