Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:eu [2011/11/28 23:37] zeman created |
user:zeman:treebanks:eu [2011/11/29 09:34] zeman Domain. |
||
---|---|---|---|
Line 23: | Line 23: | ||
* Itziar Aduriz, María Jesús Aranzabe, José María Arriola, Aitziber Atutxa, Arantza Díaz de Ilarraza, Aitzpea Garmendia, Maite Oronoz: [[http:// | * Itziar Aduriz, María Jesús Aranzabe, José María Arriola, Aitziber Atutxa, Arantza Díaz de Ilarraza, Aitzpea Garmendia, Maite Oronoz: [[http:// | ||
* Documentation | * Documentation | ||
- | * Description of tags and feature values is provided in the '' | + | * Description of tags and feature values is hard to find; the '' |
+ | * María Jesús Aranzabe, José Mari Arriola, Aitziber Atutxa, Irene Balza, Larraitz Uria: [[http:// | ||
==== Domain ==== | ==== Domain ==== | ||
- | Mixed (“GDT consists of randomly selected textual fragments | + | Newswire + unknown |
==== Size ==== | ==== Size ==== | ||
Line 36: | Line 37: | ||
The syntactic annotation style and the tagset for dependency relations (analytical functions) in GDT have been modeled after the [[http:// | The syntactic annotation style and the tagset for dependency relations (analytical functions) in GDT have been modeled after the [[http:// | ||
+ | |||
+ | Part-of-speech tag descriptions (obtained by e-mail from Koldo Gojenola, thanks!): | ||
+ | |||
+ | * IZE = noun | ||
+ | * ARR = common | ||
+ | * IZB = proper name | ||
+ | * LIB = place name | ||
+ | * ZKI = number | ||
+ | * ADJ = adjective | ||
+ | * ARR = common | ||
+ | * GAL = question | ||
+ | * ADI = verb | ||
+ | * SIN = simple | ||
+ | * ADK = composed | ||
+ | * ADP = periphrastic | ||
+ | * FAK = factitive | ||
+ | * ADB = adverb | ||
+ | * ARR = common | ||
+ | * GAL = question | ||
+ | * DET = determiner | ||
+ | * ERKARR = demonstrative common | ||
+ | * ERKIND = demonstrative emphatic | ||
+ | * NOLARR = indefinite common | ||
+ | * NOLGAL = indefinite question | ||
+ | * ZNB = number | ||
+ | * DZH = definite | ||
+ | * BAN = distributive | ||
+ | * ORD = ordinal | ||
+ | * DZG = indefinite | ||
+ | * ORO = general | ||
+ | * IOR = pronoun | ||
+ | * PERARR = personal common | ||
+ | * PERIND = personal emphatic | ||
+ | * IZGMGB = indefinite | ||
+ | * IZGGAL = question | ||
+ | * BIH = ??? | ||
+ | * ELK = ??? | ||
+ | * LOT = link | ||
+ | * LOK = connector | ||
+ | * JNT = conjunction | ||
+ | * PRT = particle | ||
+ | * ITJ = interjection | ||
+ | * BST = other | ||
+ | * ADL = auxiliary verb | ||
+ | * ADT = synthetic verb | ||
+ | * SIG = acronym | ||
+ | * SNB = symbol | ||
+ | * LAB = abbreviation | ||
+ | |||
+ | Main features: | ||
+ | |||
+ | * KAS = case (ERG = ergative, ABS = absolutive, DAT = dative...) | ||
+ | * ASP = aspect | ||
+ | * ERL = relation (relative sentence, completive sentence, indirect question...) | ||
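As a rough illustration of how the feature codes listed above might be decoded, here is a minimal Python sketch. It assumes that features arrive as a `|`-separated string of `NAME:VALUE` pairs; that layout and the helper names `parse_feats` and `gloss` are assumptions made for illustration, not something taken from the GDT documentation. Only the codes glossed above are included, so the tables are deliberately incomplete.

```python
# Glosses copied from the feature list above; the list is not exhaustive.
FEATURE_NAMES = {"KAS": "case", "ASP": "aspect", "ERL": "relation"}
CASE_VALUES = {"ERG": "ergative", "ABS": "absolutive", "DAT": "dative"}


def parse_feats(feats, pair_sep="|", kv_sep=":"):
    """Split a feature string into a dict.

    Both separators are assumptions about the file format; pass different
    ones if the data at hand is encoded differently.
    """
    if not feats or feats == "_":
        return {}
    result = {}
    for item in feats.split(pair_sep):
        name, sep, value = item.partition(kv_sep)
        # A bare flag without a value is stored with an empty string.
        result[name] = value if sep else ""
    return result


def gloss(feats):
    """Return readable 'label=value' strings for the features we know."""
    out = []
    for name, value in parse_feats(feats).items():
        label = FEATURE_NAMES.get(name, name)  # unknown names pass through
        if name == "KAS":
            value = CASE_VALUES.get(value, value)  # unknown cases pass through
        out.append(f"{label}={value}" if value else label)
    return out
```

For example, `gloss("KAS:ERG")` maps the code to the descriptions given above, while any feature name or value not in the tables is passed through unchanged rather than guessed.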
==== Sample ==== | ==== Sample ==== |