Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks [2012/03/22 10:00]
zeman Alphabetic order.
user:zeman:treebanks [2013/10/09 22:26]
zeman New Spanish treebank.
Line 32: Line 32:
   * [[user:zeman:treebanks:te|Telugu (te)]]
   * [[user:zeman:treebanks:tr|Turkish (tr)]]
 +
 +===== To Process =====
 +
 +Hi,
 +I have downloaded the new Spanish dependency corpus IULA (larger than AnCora):
 +/net/projects/tectomt_shared/data/resources/treebanks/es
 +
 +License:  CC BY 3.0 (Unported)
 +Web:      http://www.iula.upf.edu/recurs01_tbk_uk.htm
 +Doc:      http://www.iula.upf.edu/recurs01_conll_uk.htm
 +Download: http://repositori.upf.edu/handle/10230/20048
 +Parsing:  http://www.taln.upf.edu/system/files/biblio_files/ijcnlp_final_padro_et_al_2013.pdf
 +          state-of-the-art LAS score is 94.7 using Mate-C
 +sentences  42,000
 +tokens    590,000
 +
 +The sentences were chosen from the IULA LSP corpus, automatically annotated with POS information, and manually annotated with syntactic information using the DELPH-IN environment. The resulting syntactic analyses are automatically converted to dependencies and delivered in the CoNLL format.
 +
 +Martin
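
The sentence and token counts quoted in the note could be checked against the downloaded files with a short script. The following is a minimal sketch, not part of the original note. It assumes the usual CoNLL layout (one token per line with tab-separated columns, and a blank line between sentences). The function name `count_conll` and the two-sentence sample are made up for illustration; the sample is not taken from the IULA treebank.

```python
from typing import Iterable, Tuple


def count_conll(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sentences, tokens) for CoNLL-style input.

    Assumption: every non-blank line is one token and blank lines
    separate sentences. Comment lines are not handled.
    """
    sentences = 0
    tokens = 0
    in_sentence = False
    for line in lines:
        if line.strip():
            # Non-blank line: one token of the current sentence.
            tokens += 1
            in_sentence = True
        elif in_sentence:
            # Blank line closes the sentence that was open.
            sentences += 1
            in_sentence = False
    if in_sentence:
        # The last sentence may lack a trailing blank line.
        sentences += 1
    return sentences, tokens


# Made-up sample in a 10-column layout; only ID, FORM and HEAD are filled in.
SAMPLE = (
    "1\tEl\t_\t_\t_\t_\t2\t_\t_\t_\n"
    "2\tgato\t_\t_\t_\t_\t3\t_\t_\t_\n"
    "3\tduerme\t_\t_\t_\t_\t0\t_\t_\t_\n"
    "\n"
    "1\tHola\t_\t_\t_\t_\t0\t_\t_\t_\n"
    "2\tmundo\t_\t_\t_\t_\t1\t_\t_\t_\n"
)

if __name__ == "__main__":
    print(count_conll(SAMPLE.splitlines()))
```

To run it on real data, an open file object can be passed instead of the sample lines, since iterating over a text file yields its lines. The actual file names under the directory above are not given in the note.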
