Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks [2012/03/22 09:57]
zeman Tamil, Telugu and Turkish.
user:zeman:treebanks [2014/05/30 12:53]
zeman HamleDT TO DO moved to a separate page.
Line 3: Line 3:
 http://ufal.mff.cuni.cz/hamledt/
 
 +  * [[user:zeman:treebanks:grc|Ancient Greek (grc)]]
   * [[user:zeman:treebanks:ar|Arabic (ar)]]
 -  * [[user:zeman:treebanks:bg|Bulgarian (bg)]]
 +  * [[user:zeman:treebanks:eu|Basque (eu)]]
   * [[user:zeman:treebanks:bn|Bengali (bn)]]
 +  * [[user:zeman:treebanks:bg|Bulgarian (bg)]]
   * [[user:zeman:treebanks:ca|Catalan (ca)]]
   * [[user:zeman:treebanks:cs|Czech (cs)]]
   * [[user:zeman:treebanks:da|Danish (da)]]
 -  * [[user:zeman:treebanks:de|German (de)]]
 -  * [[user:zeman:treebanks:el|Greek (el)]]
 +  * [[user:zeman:treebanks:nl|Dutch (nl)]]
   * [[user:zeman:treebanks:en|English (en)]]
 -  * [[user:zeman:treebanks:es|Spanish (es)]]
   * [[user:zeman:treebanks:et|Estonian (et)]]
 -  * [[user:zeman:treebanks:eu|Basque (eu)]]
 -  * [[user:zeman:treebanks:fa|Persian (fa)]]
   * [[user:zeman:treebanks:fi|Finnish (fi)]]
 -  * [[user:zeman:treebanks:grc|Ancient Greek (grc)]]
 +  * [[user:zeman:treebanks:de|German (de)]]
 +  * [[user:zeman:treebanks:el|Greek (el)]]
   * [[user:zeman:treebanks:hi|Hindi (hi)]]
   * [[user:zeman:treebanks:hu|Hungarian (hu)]]
Line 23: Line 22:
   * [[user:zeman:treebanks:ja|Japanese (ja)]]
   * [[user:zeman:treebanks:la|Latin (la)]]
 -  * [[user:zeman:treebanks:nl|Dutch (nl)]]
 +  * [[user:zeman:treebanks:fa|Persian (fa)]]
   * [[user:zeman:treebanks:pt|Portuguese (pt)]]
   * [[user:zeman:treebanks:ro|Romanian (ro)]]
   * [[user:zeman:treebanks:ru|Russian (ru)]]
 +  * [[user:zeman:treebanks:sk|Slovak (sk)]]
   * [[user:zeman:treebanks:sl|Slovene (sl)]]
 +  * [[user:zeman:treebanks:es|Spanish (es)]]
   * [[user:zeman:treebanks:sv|Swedish (sv)]]
   * [[user:zeman:treebanks:ta|Tamil (ta)]]
   * [[user:zeman:treebanks:te|Telugu (te)]]
   * [[user:zeman:treebanks:tr|Turkish (tr)]]
 +
 +===== To Process =====
 +
 +Hi,
 +I have downloaded the new Spanish dependency corpus IULA (larger than AnCora) to
 +/net/projects/tectomt_shared/data/resources/treebanks/es
 +
 +License:  CC BY 3.0 (Unported)
 +Web:      http://www.iula.upf.edu/recurs01_tbk_uk.htm
 +Doc:      http://www.iula.upf.edu/recurs01_conll_uk.htm
 +Download: http://repositori.upf.edu/handle/10230/20048
 +Parsing:  http://www.taln.upf.edu/system/files/biblio_files/ijcnlp_final_padro_et_al_2013.pdf
 +          state-of-the-art LAS score is 94.7 using Mate-C
 +Sentences: 42,000
 +Tokens:    590,000
 +
 +The sentences were chosen from the IULA LSP corpus, automatically annotated with POS information, and manually annotated with syntactic information using the DELPH-IN environment. The resulting syntactic analyses are automatically converted to dependencies and delivered in the CoNLL format.
 +
 +Martin
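
As a quick sanity check on the sentence and token figures quoted in the note, the counts of a CoNLL-style file could be recomputed roughly as follows. This is only a sketch under assumptions: it supposes one token per line and blank lines between sentences, which is the usual CoNLL layout but has not been verified against the IULA release. The function name `count_conll` is made up for illustration.

```python
from typing import Iterable, Tuple


def count_conll(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sentences, tokens) for CoNLL-style input.

    Assumption: one token per non-blank line, sentences separated
    by one or more blank lines.
    """
    sentences = 0
    tokens = 0
    in_sentence = False
    for line in lines:
        if not line.strip():
            # A blank line closes the sentence that is currently open, if any.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        tokens += 1
        in_sentence = True
    # The last sentence may not be followed by a blank line.
    if in_sentence:
        sentences += 1
    return sentences, tokens
```

The function only iterates over lines, so an open text file handle can be passed directly in place of a list.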
