Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks [2012/01/17 10:41]
zeman Swedish.
user:zeman:treebanks [2014/07/17 17:43] (current)
zeman Croatian.
Line 1: Line 1:
 ====== Treebanks for Various Languages ====== ====== Treebanks for Various Languages ======
  
 +http://ufal.mff.cuni.cz/hamledt/ or [[hamledt|HamleDT in the Wiki]]
 +
 +  * [[user:zeman:treebanks:grc|Ancient Greek (grc)]]
   * [[user:zeman:treebanks:ar|Arabic (ar)]]   * [[user:zeman:treebanks:ar|Arabic (ar)]]
-  * [[user:zeman:treebanks:bg|Bulgarian (bg)]]+  * [[user:zeman:treebanks:eu|Basque (eu)]]
   * [[user:zeman:treebanks:bn|Bengali (bn)]]   * [[user:zeman:treebanks:bn|Bengali (bn)]]
 +  * [[user:zeman:treebanks:bg|Bulgarian (bg)]]
   * [[user:zeman:treebanks:ca|Catalan (ca)]]   * [[user:zeman:treebanks:ca|Catalan (ca)]]
 +  * [[user:zeman:treebanks:hr|Croatian (hr)]]
   * [[user:zeman:treebanks:cs|Czech (cs)]]   * [[user:zeman:treebanks:cs|Czech (cs)]]
   * [[user:zeman:treebanks:da|Danish (da)]]   * [[user:zeman:treebanks:da|Danish (da)]]
-  * [[user:zeman:treebanks:de|German (de)]] +  * [[user:zeman:treebanks:nl|Dutch (nl)]]
-  * [[user:zeman:treebanks:el|Greek (el)]]+
   * [[user:zeman:treebanks:en|English (en)]]   * [[user:zeman:treebanks:en|English (en)]]
-  * [[user:zeman:treebanks:es|Spanish (es)]] 
   * [[user:zeman:treebanks:et|Estonian (et)]]   * [[user:zeman:treebanks:et|Estonian (et)]]
-  * [[user:zeman:treebanks:eu|Basque (eu)]] 
   * [[user:zeman:treebanks:fi|Finnish (fi)]]   * [[user:zeman:treebanks:fi|Finnish (fi)]]
-  * [[user:zeman:treebanks:grc|Ancient Greek (grc)]]+  * [[user:zeman:treebanks:de|German (de)]] 
 +  * [[user:zeman:treebanks:el|Greek (el)]]
   * [[user:zeman:treebanks:hi|Hindi (hi)]]   * [[user:zeman:treebanks:hi|Hindi (hi)]]
   * [[user:zeman:treebanks:hu|Hungarian (hu)]]   * [[user:zeman:treebanks:hu|Hungarian (hu)]]
Line 20: Line 23:
   * [[user:zeman:treebanks:ja|Japanese (ja)]]   * [[user:zeman:treebanks:ja|Japanese (ja)]]
   * [[user:zeman:treebanks:la|Latin (la)]]   * [[user:zeman:treebanks:la|Latin (la)]]
-  * [[user:zeman:treebanks:nl|Dutch (nl)]]+  * [[user:zeman:treebanks:fa|Persian (fa)]]
   * [[user:zeman:treebanks:pt|Portuguese (pt)]]   * [[user:zeman:treebanks:pt|Portuguese (pt)]]
   * [[user:zeman:treebanks:ro|Romanian (ro)]]   * [[user:zeman:treebanks:ro|Romanian (ro)]]
   * [[user:zeman:treebanks:ru|Russian (ru)]]   * [[user:zeman:treebanks:ru|Russian (ru)]]
 +  * [[user:zeman:treebanks:sk|Slovak (sk)]]
   * [[user:zeman:treebanks:sl|Slovene (sl)]]   * [[user:zeman:treebanks:sl|Slovene (sl)]]
 +  * [[user:zeman:treebanks:es|Spanish (es)]]
   * [[user:zeman:treebanks:sv|Swedish (sv)]]   * [[user:zeman:treebanks:sv|Swedish (sv)]]
 +  * [[user:zeman:treebanks:ta|Tamil (ta)]]
 +  * [[user:zeman:treebanks:te|Telugu (te)]]
 +  * [[user:zeman:treebanks:tr|Turkish (tr)]]
 +
 +===== To Process =====
 +
 +Hi,
 +I have downloaded the new Spanish dependency corpus IULA (larger than AnCora):
 +/net/projects/tectomt_shared/data/resources/treebanks/es
 +
 +License:  CC BY 3.0 (Unported)
 +Web:      http://www.iula.upf.edu/recurs01_tbk_uk.htm
 +Doc:      http://www.iula.upf.edu/recurs01_conll_uk.htm
 +Download: http://repositori.upf.edu/handle/10230/20048
 +Parsing:  http://www.taln.upf.edu/system/files/biblio_files/ijcnlp_final_padro_et_al_2013.pdf
 +          state-of-the-art LAS score is 94.7 using Mate-C
 +sentences  42,000
 +tokens    590,000
 +
 +The sentences were chosen from the IULA LSP corpus, automatically annotated with POS information, and manually annotated with syntactic information using the DELPH-IN environment. The resulting syntactic analysis is automatically converted to dependencies and delivered in the CoNLL format.
 +
 +Martin
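
The sentence and token counts quoted above could be checked against the downloaded files with a short script. The following is only a minimal sketch: it assumes the usual CoNLL layout of one token per line with tab-separated columns and a blank line between sentences, and the function name `count_conll` and the sample data are hypothetical, not part of the IULA distribution.

```python
from typing import Iterable, Tuple


def count_conll(lines: Iterable[str]) -> Tuple[int, int]:
    """Count (sentences, tokens) in CoNLL-style input.

    Assumes one token per non-blank line and blank lines as
    sentence separators; this is an assumption about the file layout.
    """
    sentences = 0
    tokens = 0
    in_sentence = False
    for line in lines:
        if line.strip():
            # Non-blank line: one token of the current sentence.
            tokens += 1
            in_sentence = True
        elif in_sentence:
            # Blank line closes the sentence that was open.
            sentences += 1
            in_sentence = False
    if in_sentence:
        # File ended without a trailing blank line.
        sentences += 1
    return sentences, tokens


# Hypothetical three-column sample, only to illustrate the counting.
SAMPLE = "1\tHola\t_\n2\tmundo\t_\n\n1\tAdiós\t_\n"
sample_counts = count_conll(SAMPLE.splitlines())
print(sample_counts)
```

For a real file, the same function can be fed an open file handle, for example `count_conll(open(path, encoding="utf-8"))`, where `path` points at one of the downloaded CoNLL files.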