Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks [2011/11/18 18:03]
zeman German size and inside.
user:zeman:treebanks [2011/11/20 18:14]
zeman English size.
Line 1388: Line 1388:
     * Berthold Crysmann, Silvia Hansen-Schirra, George Smith, Dorothea Ziegler-Eisele: [[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/annotation/tiger_scheme-morph.pdf|TIGER Morphologie-Annotationsschema]], 2005.     * Berthold Crysmann, Silvia Hansen-Schirra, George Smith, Dorothea Ziegler-Eisele: [[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/annotation/tiger_scheme-morph.pdf|TIGER Morphologie-Annotationsschema]], 2005.
     * Stefanie Albert, Jan Anderssen, Regine Bader, Stephanie Becker, Tobias Bracht, Sabine Brants, Thorsten Brants, Vera Demberg, Stefanie Dipper, Peter Eisenberg, Silvia Hansen, Hagen Hirschmann, Juliane Janitzek, Carolin Kirstein, Robert Langner, Lukas Michelbacher, Oliver Plaehn, Cordula Preis, Marcus Pußel, Marco Rower, Bettina Schrader, Anne Schwartz, George Smith, Hans Uszkoreit: [[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/annotation/tiger_scheme-syntax.pdf|TIGER Annotationsschema]] //(syntax)//, 2003.     * Stefanie Albert, Jan Anderssen, Regine Bader, Stephanie Becker, Tobias Bracht, Sabine Brants, Thorsten Brants, Vera Demberg, Stefanie Dipper, Peter Eisenberg, Silvia Hansen, Hagen Hirschmann, Juliane Janitzek, Carolin Kirstein, Robert Langner, Lukas Michelbacher, Oliver Plaehn, Cordula Preis, Marcus Pußel, Marco Rower, Bettina Schrader, Anne Schwartz, George Smith, Hans Uszkoreit: [[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/annotation/tiger_scheme-syntax.pdf|TIGER Annotationsschema]] //(syntax)//, 2003.
 +    * The header of the XML version of the TIGER Treebank lists the various kinds of tags, each with a brief explanation.
  
 ==== Domain ==== ==== Domain ====
Line 1406: Line 1407:
  
 It is not clear what the //semi-automatic// annotation means (probably first auto-tagging, then manual correction?) and whether it also applies to the morphosyntactic annotation. The CoNLL 2009 version also contains automatically disambiguated lemmas, tags and features. It is not clear what the //semi-automatic// annotation means (probably first auto-tagging, then manual correction?) and whether it also applies to the morphosyntactic annotation. The CoNLL 2009 version also contains automatically disambiguated lemmas, tags and features.
 +
 +The original treebank is phrase-based. The dependencies in the CoNLL versions must therefore have been derived using a head-selection procedure. Besides the CoNLL data, the TIGER project also provides a subset of the TIGER Treebank in a dependency format.
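
What such a head-selection procedure might look like can be sketched on the first TIGER sentence (punctuation omitted). The rule used here is an assumption made for illustration, not the documented conversion: the child on an ''HD'' edge is the head, otherwise the rightmost child. The dictionaries and function names are hypothetical.

```python
# Hypothetical head-selection sketch. The fallback rule (rightmost child when
# no HD edge exists) is an assumption, not the rule used for the CoNLL data.

# Terminals: token position -> word form (punctuation tokens 1 and 9 omitted).
TERMINALS = {2: "Ross", 3: "Perot", 4: "wäre", 5: "vielleicht",
             6: "ein", 7: "prächtiger", 8: "Diktator"}

# Nonterminals: node id -> list of (edge label, child id), as in the TIGER-XML sample.
NONTERMINALS = {
    500: [("PNC", 2), ("PNC", 3)],                        # PN
    501: [("NK", 6), ("NK", 7), ("NK", 8)],               # NP
    502: [("SB", 500), ("HD", 4), ("MO", 5), ("PD", 501)],  # S
}


def lexical_head(node):
    """Return the terminal that heads `node`; a terminal heads itself."""
    if node in TERMINALS:
        return node
    children = NONTERMINALS[node]
    # Prefer the child attached by an HD edge, else take the rightmost child.
    head_child = next((c for label, c in children if label == "HD"),
                      children[-1][1])
    return lexical_head(head_child)


def to_dependencies(root):
    """Map every terminal to (head terminal, relation); the root gets head 0."""
    deps = {lexical_head(root): (0, "ROOT")}

    def visit(node):
        if node in TERMINALS:
            return
        head = lexical_head(node)
        for label, child in NONTERMINALS[node]:
            child_head = lexical_head(child)
            if child_head != head:
                # The edge label of the phrase becomes the dependency relation.
                deps[child_head] = (head, label)
            visit(child)

    visit(root)
    return deps


DEPS = to_dependencies(502)
for tok in sorted(DEPS):
    head, rel = DEPS[tok]
    print(tok, TERMINALS[tok], head, rel)
```

With this fallback the output agrees with the gold heads and relations of the CoNLL 2009 sample for this sentence. The CoNLL 2006 sample attaches //Perot// to //Ross// instead, so the two conversions evidently did not use identical rules.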
 +
 +==== Sample ====
 +
 +The first sentence of TIGER Treebank 2.1 in the TIGER-XML format:
 +
 +<code xml><s id="s1">
 +  <graph root="s1_VROOT">
 +    <terminals>
 +      <t id="s1_1" word="``" lemma="--" pos="$(" morph="--" case="--" number="--" gender="--" person="--" degree="--" tense="--" mood="--" />
 +      <t id="s1_2" word="Ross" lemma="Ross" pos="NE" morph="Nom.Sg.Masc" case="Nom" number="Sg" gender="Masc" person="--" degree="--" tense="--" mood="--" />
 +      <t id="s1_3" word="Perot" lemma="Perot" pos="NE" morph="Nom.Sg.Masc" case="Nom" number="Sg" gender="Masc" person="--" degree="--" tense="--" mood="--" />
 +      <t id="s1_4" word="wäre" lemma="sein" pos="VAFIN" morph="3.Sg.Past.Subj" case="--" number="Sg" gender="--" person="3" degree="--" tense="Past" mood="Subj" />
 +      <t id="s1_5" word="vielleicht" lemma="vielleicht" pos="ADV" morph="--" case="--" number="--" gender="--" person="--" degree="--" tense="--" mood="--" />
 +      <t id="s1_6" word="ein" lemma="ein" pos="ART" morph="Nom.Sg.Masc" case="Nom" number="Sg" gender="Masc" person="--" degree="--" tense="--" mood="--" />
 +      <t id="s1_7" word="prächtiger" lemma="prächtig" pos="ADJA" morph="Pos.Nom.Sg.Masc" case="Nom" number="Sg" gender="Masc" person="--" degree="Pos" tense="--" mood="--" />
 +      <t id="s1_8" word="Diktator" lemma="Diktator" pos="NN" morph="Nom.Sg.Masc" case="Nom" number="Sg" gender="Masc" person="--" degree="--" tense="--" mood="--" />
 +      <t id="s1_9" word="''" lemma="--" pos="$(" morph="--" case="--" number="--" gender="--" person="--" degree="--" tense="--" mood="--" />
 +    </terminals>
 +    <nonterminals>
 +      <nt id="s1_500" cat="PN">
 +        <edge label="PNC" idref="s1_2" />
 +        <edge label="PNC" idref="s1_3" />
 +      </nt>
 +      <nt id="s1_501" cat="NP">
 +        <edge label="NK" idref="s1_6" />
 +        <edge label="NK" idref="s1_7" />
 +        <edge label="NK" idref="s1_8" />
 +      </nt>
 +      <nt id="s1_502" cat="S">
 +        <edge label="SB" idref="s1_500" />
 +        <edge label="HD" idref="s1_4" />
 +        <edge label="MO" idref="s1_5" />
 +        <edge label="PD" idref="s1_501" />
 +      </nt>
 +      <nt id="s1_VROOT" cat="VROOT">
 +        <edge label="--" idref="s1_1" />
 +        <edge label="--" idref="s1_502" />
 +        <edge label="--" idref="s1_9" />
 +      </nt>
 +    </nonterminals>
 +  </graph>
 +</s></code>
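
A minimal sketch of reading one such ''<s>'' element with Python's standard ''xml.etree.ElementTree'' module. The excerpt is trimmed to the proper-noun phrase, and its ''root'' attribute is changed to match, so that the example stays short. The function name ''read_sentence'' and the returned data layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the sample above (only the PN phrase is kept and the
# root attribute is adjusted accordingly; this is not the full sentence).
SAMPLE = """<s id="s1">
  <graph root="s1_500">
    <terminals>
      <t id="s1_2" word="Ross" lemma="Ross" pos="NE" morph="Nom.Sg.Masc" />
      <t id="s1_3" word="Perot" lemma="Perot" pos="NE" morph="Nom.Sg.Masc" />
    </terminals>
    <nonterminals>
      <nt id="s1_500" cat="PN">
        <edge label="PNC" idref="s1_2" />
        <edge label="PNC" idref="s1_3" />
      </nt>
    </nonterminals>
  </graph>
</s>"""


def read_sentence(xml_text):
    """Return (root id, terminals, nonterminals) for one <s> element."""
    sentence = ET.fromstring(xml_text)
    graph = sentence.find("graph")
    # Terminal id -> (word form, part-of-speech tag)
    terminals = {t.get("id"): (t.get("word"), t.get("pos"))
                 for t in graph.iter("t")}
    # Nonterminal id -> (category, list of (edge label, child id))
    nonterminals = {
        nt.get("id"): (nt.get("cat"),
                       [(e.get("label"), e.get("idref"))
                        for e in nt.findall("edge")])
        for nt in graph.iter("nt")
    }
    return graph.get("root"), terminals, nonterminals


ROOT, TERMS, NONTERMS = read_sentence(SAMPLE)
print(ROOT, TERMS, NONTERMS)
```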
 +
 +The first sentence of the CoNLL 2006 training data:
 +
 +| 1 | `` | _ | $( | $( | _ | 4 | PUNC | 4 | PUNC |
 +| 2 | Ross | _ | NE | NE | _ | 4 | SB | 4 | SB |
 +| 3 | Perot | _ | NE | NE | _ | 2 | PNC | 2 | PNC |
 +| 4 | wäre | _ | VAFIN | VAFIN | _ | 0 | ROOT | 0 | ROOT |
 +| 5 | vielleicht | _ | ADV | ADV | _ | 4 | MO | 4 | MO |
 +| 6 | ein | _ | ART | ART | _ | 8 | NK | 8 | NK |
 +| 7 | prächtiger | _ | ADJA | ADJA | _ | 8 | NK | 8 | NK |
 +| 8 | Diktator | _ | NN | NN | _ | 4 | PD | 4 | PD |
 +| 9 | <nowiki>''</nowiki> | _ | $( | $( | _ | 4 | PUNC | 4 | PUNC |
 +
 +The first sentence of the CoNLL 2006 test data:
 +
 +| 1 | Zwei | _ | CARD | CARD | _ | 2 | NK | 2 | NK |
 +| 2 | Themen | _ | NN | NN | _ | 14 | SB | 14 | SB |
 +| 3 | , | _ | $, | $, | _ | 2 | PUNC | 2 | PUNC |
 +| 4 | die | _ | PRELS | PRELS | _ | 8 | OA | 8 | OA |
 +| 5 | Perot | _ | NE | NE | _ | 8 | SB | 8 | SB |
 +| 6 | immer | _ | ADV | ADV | _ | 7 | MO | 7 | MO |
 +| 7 | wieder | _ | ADV | ADV | _ | 8 | MO | 8 | MO |
 +| 8 | anspricht | _ | VVFIN | VVFIN | _ | 2 | RC | 2 | RC |
 +| 9 | , | _ | $, | $, | _ | 2 | PUNC | 2 | PUNC |
 +| 10 | Rezession | _ | NN | NN | _ | 2 | APP | 2 | APP |
 +| 11 | und | _ | KON | KON | _ | 10 | CD | 10 | CD |
 +| 12 | Bürokratie | _ | NN | NN | _ | 10 | CJ | 10 | CJ |
 +| 13 | , | _ | $, | $, | _ | 14 | PUNC | 14 | PUNC |
 +| 14 | machen | _ | VVFIN | VVFIN | _ | 0 | ROOT | 0 | ROOT |
 +| 15 | ihnen | _ | PPER | PPER | _ | 18 | DA | 18 | DA |
 +| 16 | besonders | _ | ADV | ADV | _ | 18 | MO | 18 | MO |
 +| 17 | zu | _ | PTKZU | PTKZU | _ | 18 | PM | 18 | PM |
 +| 18 | schaffen | _ | VVINF | VVINF | _ | 14 | OC | 14 | OC |
 +| 19 | . | _ | $. | $. | _ | 14 | PUNC | 14 | PUNC |
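
The ten cells of each row correspond to the CoNLL-X columns ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD and PDEPREL. A minimal reader is sketched below, assuming the tab-separated layout of the distributed files (the wiki tables above only render it). The names ''Token'' and ''parse_conllx'' are hypothetical.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """One CoNLL-X row; field names follow the CoNLL-X column names."""
    id: int
    form: str
    lemma: str
    cpostag: str
    postag: str
    feats: str
    head: int
    deprel: str
    phead: str
    pdeprel: str


def parse_conllx(block):
    """Parse one sentence given as tab-separated lines, one token per line."""
    tokens = []
    for line in block.strip().splitlines():
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        tokens.append(Token(int(cols[0]), *cols[1:6], int(cols[6]), *cols[7:]))
    return tokens


# Rows 2-4 of the training sentence shown earlier.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["2", "Ross", "_", "NE", "NE", "_", "4", "SB", "4", "SB"],
    ["3", "Perot", "_", "NE", "NE", "_", "2", "PNC", "2", "PNC"],
    ["4", "wäre", "_", "VAFIN", "VAFIN", "_", "0", "ROOT", "0", "ROOT"],
])

TOKENS = parse_conllx(SAMPLE)
for tok in TOKENS:
    print(tok.id, tok.form, tok.head, tok.deprel)
```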
 +
 +The first sentence of the CoNLL 2009 training data:
 +
 +| 1 | `` | _ | `` | $( | $( | _ | _ | 4 | 4 | PUNC | PUNC | _ | _ |
 +| 2 | Ross | Ross | Roß | NE | NN | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | _ | 3 | 3 | PNC | PNC | _ | _ |
 +| 3 | Perot | Perot | Perot | NE | NE | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | _ | 4 | 4 | SB | SB | _ | _ |
 +| 4 | wäre | sein | sein | VAFIN | VAFIN | 3<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Past<nowiki>|</nowiki>Subj | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Past<nowiki>|</nowiki>Subj | 0 | 0 | ROOT | ROOT | _ | _ |
 +| 5 | vielleicht | vielleicht | vielleicht | ADV | ADV | _ | _ | 4 | 4 | MO | MO | _ | _ |
 +| 6 | ein | ein | ein | ART | ART | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>* | 8 | 8 | NK | NK | _ | _ |
 +| 7 | prächtiger | prächtig | prächtig | ADJA | ADJA | Pos<nowiki>|</nowiki>Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>*<nowiki>|</nowiki>*<nowiki>|</nowiki>* | 8 | 8 | NK | NK | _ | _ |
 +| 8 | Diktator | Diktator | Diktator | NN | NN | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | 4 | 4 | PD | PD | _ | _ |
 +| 9 | <nowiki>''</nowiki> | _ | <nowiki>''</nowiki> | $( | $( | _ | _ | 4 | 4 | PUNC | PUNC | _ | _ |
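
In the CoNLL 2009 layout, gold-standard and automatically predicted values sit in adjacent columns (LEMMA/PLEMMA, POS/PPOS, FEAT/PFEAT, HEAD/PHEAD, DEPREL/PDEPREL). The sketch below pairs them up to list tokens whose predicted tag differs from the gold tag. It assumes tab-separated input, and the helper names are hypothetical.

```python
# Column names of the CoNLL 2009 format; further APRED columns may follow.
CONLL09_COLUMNS = ["ID", "FORM", "LEMMA", "PLEMMA", "POS", "PPOS",
                   "FEAT", "PFEAT", "HEAD", "PHEAD", "DEPREL", "PDEPREL",
                   "FILLPRED", "PRED"]


def parse_row(line):
    """Turn one tab-separated row into a dict keyed by column name."""
    cells = line.split("\t")
    row = dict(zip(CONLL09_COLUMNS, cells))
    row["APREDS"] = cells[len(CONLL09_COLUMNS):]  # argument columns, if any
    return row


def tagging_errors(rows):
    """Return (form, gold tag, predicted tag) where POS and PPOS disagree."""
    return [(r["FORM"], r["POS"], r["PPOS"])
            for r in rows if r["POS"] != r["PPOS"]]


# Rows 2 and 3 of the training sentence shown above.
LINES = [
    "\t".join(["2", "Ross", "Ross", "Roß", "NE", "NN", "Nom|Sg|Masc", "_",
               "3", "3", "PNC", "PNC", "_", "_"]),
    "\t".join(["3", "Perot", "Perot", "Perot", "NE", "NE", "Nom|Sg|Masc", "_",
               "4", "4", "SB", "SB", "_", "_"]),
]

ROWS = [parse_row(line) for line in LINES]
print(tagging_errors(ROWS))
```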
 +
 +The first sentence of the CoNLL 2009 development data:
 +
 +| 1 | Maschinenbau | Maschinenbau | Maschinenbau | NN | NN | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Masc | 0 | 4 | ROOT | NK | _ | _ |
 +| 2 | / | _ | / | $( | $( | _ | _ | 0 | 1 | PUNC | PUNC | _ | _ |
 +| 3 | ( | _ | ( | $( | $( | _ | _ | 0 | 4 | PUNC | PUNC | _ | _ |
 +| 4 | Zusammenfassung | Zusammenfassung | Zusammenfassung | NN | NN | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | 0 | 0 | ROOT | ROOT | _ | _ |
 +| 5 | ) | _ | ) | $( | $( | _ | _ | 0 | 1 | PUNC | PUNC | _ | _ |
 +
 +The first sentence of the CoNLL 2009 test data:
 +
 +| 1 | Gegen | gegen | gegen | APPR | APPR | _ | _ | _ | _ | _ | _ | _ |
 +| 2 | eine | ein | ein | ART | ART | Acc<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | _ | _ | _ | _ | _ |
 +| 3 | Erweiterung | Erweiterung | Erweiterung | NN | NN | Acc<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | _ | _ | _ | _ | _ |
 +| 4 | ihrer | ihr | ihr | PPOSAT | PPOSAT | Gen<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>*<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 5 | Organisation | Organisation | Organisation | NN | NN | Gen<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | _ | _ | _ | _ | _ |
 +| 6 | zu | zu | zu | APPR | APPR | _ | _ | _ | _ | _ | _ | _ |
 +| 7 | einem | ein | ein | ART | ART | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 8 | sicherheitspolitischen | sicherheitspolitisch | sicherheitspolitisch | ADJA | ADJA | Pos<nowiki>|</nowiki>Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | Pos<nowiki>|</nowiki>*<nowiki>|</nowiki>*<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 9 | Forum | Forum | Forum | NN | NN | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | _ | _ | _ | _ | _ |
 +| 10 | sprachen | sprechen | sprechen | VVFIN | VVFIN | 3<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Past<nowiki>|</nowiki>Ind | *<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Past<nowiki>|</nowiki>Ind | _ | _ | _ | _ | Y |
 +| 11 | sich | sich | er<nowiki>|</nowiki>es<nowiki>|</nowiki>sie<nowiki>|</nowiki>Sie | PRF | PRF | 3<nowiki>|</nowiki>Acc<nowiki>|</nowiki>Pl | *<nowiki>|</nowiki>*<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 12 | die | der | d | ART | ART | Nom<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>*<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 13 | meisten | meister | meist | PIAT | PIAT | Nom<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>*<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 14 | Staaten | Staat | Staat | NN | NN | Nom<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Masc | *<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Masc | _ | _ | _ | _ | _ |
 +| 15 | beim | bei | beim | APPRART | APPRART | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 16 | Gipfeltreffen | Gipfeltreffen | Gipfeltreffen | NN | NN | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | *<nowiki>|</nowiki>*<nowiki>|</nowiki>Neut | _ | _ | _ | _ | _ |
 +| 17 | für | für | für | APPR | APPR | _ | _ | _ | _ | _ | _ | _ |
 +| 18 | Asiatisch-Pazifische | asiatisch-pazifisch | Asiatisch-Pazifische | ADJA | NN | Pos<nowiki>|</nowiki>Acc<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>*<nowiki>|</nowiki>* | _ | _ | _ | _ | _ |
 +| 19 | Wirtschaftskooperation | Wirtschaftskooperation | Wirtschaftskooperation | NN | NN | Acc<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | _ | _ | _ | _ | _ |
 +| 20 | ( | _ | ( | $( | $( | _ | _ | _ | _ | _ | _ | _ |
 +| 21 | Apec | Apec | _ | NE | NE | Nom<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Fem | _ | _ | _ | _ | _ | _ |
 +| 22 | ) | _ | ) | $( | $( | _ | _ | _ | _ | _ | _ | _ |
 +| 23 | in | in | in | APPR | APPR | _ | _ | _ | _ | _ | _ | _ |
 +| 24 | Osaka | Osaka | Osaka | NE | NE | Dat<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | *<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Neut | _ | _ | _ | _ | _ |
 +| 25 | aus | aus | aus | PTKVZ | PTKVZ | _ | _ | _ | _ | _ | _ | _ |
 +| 26 | . | _ | . | $. | $. | _ | _ | _ | _ | _ | _ | _ |
 +
 +==== Parsing ====
 +
 +TIGER is a mildly nonprojective treebank. 15875 of the 680,710 tokens in the CoNLL 2009 training+development datasets are attached nonprojectively (2.33%).
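
A figure like this can be obtained by counting tokens whose attachment is nonprojective. The sketch below uses one common definition, stated here as an assumption since the page does not define the term: a token is attached nonprojectively if some token lying between it and its head is not dominated by that head. The input is assumed to be a well-formed tree, and the function name is hypothetical.

```python
def nonprojective_tokens(heads):
    """Return the 1-based ids of nonprojectively attached tokens.

    heads[i] is the head of token i+1; 0 denotes the artificial root.
    Assumes the heads form a tree (no cycles).
    """
    def dominated_by(node, ancestor):
        # Walk up from node until the ancestor or the artificial root is hit.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return ancestor == 0

    result = []
    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        lo, hi = sorted((dep, head))
        # Every token strictly between dependent and head must be under the head.
        if any(not dominated_by(k, head) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


# Heads of the first CoNLL 2006 test sentence shown earlier (19 tokens).
GERMAN_TEST = [2, 14, 2, 8, 8, 7, 8, 2, 2, 2, 10, 10, 14, 0, 18, 18, 18, 14, 14]
# Constructed example: token 2 depends on token 4, but token 3 between them does not.
TOY = [0, 4, 1, 1]

print(nonprojective_tokens(GERMAN_TEST))
print(nonprojective_tokens(TOY))
print(round(100 * 15875 / 680710, 2))  # share reported above, in percent
```

The sample sentence turns out to be fully projective under this definition; only the constructed example contains a nonprojective attachment.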
 +
 +The results of the CoNLL 2006 shared task are [[http://ilk.uvt.nl/conll/results.html|available online]]. They have been published in [[http://aclweb.org/anthology-new/W/W06/W06-2920.pdf|(Buchholz and Marsi, 2006)]]. The evaluation procedure was non-standard because it excluded punctuation tokens. These are the best results for German:
 +
 +^ Parser (Authors) ^ LAS ^ UAS ^
 +| MST (McDonald et al.) | 87.34 | 90.38 |
 +| Riedel et al. | 86.24 | 89.76 |
 +| Basis (O'Neil) | 85.36 | 89.16 |
 +| Malt (Nivre et al.) | 85.82 | 88.76 |
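
How the two scores relate, and what excluding punctuation changes, can be sketched as follows. This is not the official evaluation script. The punctuation test (every character of the form has a Unicode punctuation category) is an assumption, and the predicted analysis is invented purely for illustration.

```python
import unicodedata


def is_punctuation(form):
    """Assumed criterion: all characters belong to a Unicode P* category."""
    return all(unicodedata.category(ch).startswith("P") for ch in form)


def attachment_scores(gold, pred, include_punct=False):
    """Return (LAS, UAS) in percent.

    gold and pred are parallel lists of (form, head, deprel) triples.
    """
    total = las = uas = 0
    for (form, g_head, g_rel), (_, p_head, p_rel) in zip(gold, pred):
        if not include_punct and is_punctuation(form):
            continue  # punctuation tokens are not scored
        total += 1
        if g_head == p_head:
            uas += 1          # correct head
            if g_rel == p_rel:
                las += 1      # correct head and correct label
    return 100 * las / total, 100 * uas / total


# Gold rows taken from the test sentence above; PRED is a made-up parser output.
GOLD = [("Zwei", 2, "NK"), ("Themen", 14, "SB"), (",", 2, "PUNC"), ("machen", 0, "ROOT")]
PRED = [("Zwei", 2, "NK"), ("Themen", 14, "OA"), (",", 14, "PUNC"), ("machen", 0, "ROOT")]

print(attachment_scores(GOLD, PRED))                      # punctuation excluded
print(attachment_scores(GOLD, PRED, include_punct=True))  # punctuation included
```

Because the misattached comma is skipped in the first call, both scores are higher there than when punctuation is scored.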
 +
 +The results of the CoNLL 2009 shared task are [[http://ufal.mff.cuni.cz/conll2009-st/results/results.php|available online]]. They have been published in [[http://aclweb.org/anthology/W/W09/W09-1201.pdf|(Hajič et al., 2009)]]. Unlabeled attachment score was not published. These are the best results for German:
 +
 +^ Parser (Authors) ^ LAS ^
 +| Bohnet | 87.48 |
 +| Merlo | 87.29 |
 +| Chen | 86.24 |
 +| Che | 86.19 |
 +
 +===== Greek (el) =====
 +
 +Greek Dependency Treebank (GDT)
 +
 +==== Versions ====
 +
 +  * CoNLL 2007
 +
 +==== Obtaining and License ====
 +
 +There does not seem to be any regular distribution channel for the Greek Dependency Treebank. The CoNLL 2007 version had a restricted license for the duration of the shared task only. Republication of the CoNLL version by the LDC is planned but has not happened yet. In the meantime, one can ask Prokopis Prokopidis (prokopis (at) ilsp (dot) gr) about the availability of the corpus.
 +
 +GDT was created by members of the [[http://www.ilsp.gr/|Institute for Language and Speech Processing]] (Ινστιτούτο Επεξεργασίας του Λόγου, ILSP/ΙΕΛ), Επιδαύρου & Αρτέμιδος 6, Παράδεισος Αμαρουσίου, GR-15125 Αθήνα, Greece.
 +
 +==== References ====
 +
 +  * Website
 +    * //no website dedicated to the treebank//
 +  * Data
 +    * //no separate citation//
 +  * Principal publications
 +    * Prokopis Prokopidis, Elina Desipri, Maria Koutsombogera, Harris Papageorgiou, Stelios Piperidis: [[http://www.ilsp.gr/homepages/prokopidis/documents/gdt_tlt2005.pdf|Theoretical and Practical Issues in the Construction of a Greek Dependency Corpus]] In: Montserrat Civit, Sandra Kübler, Ma. Antònia Martí (eds.), Proceedings of The Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005), pp. 149-160, Barcelona, Spain, 2005.
 +  * Documentation
 +    * Description of tags and feature values is provided in the ''doc/README'' file in the CoNLL 2007 data distribution.
 +
 +==== Domain ====
 +
 +Mixed (“GDT consists of randomly selected textual fragments and texts in three domains: politics (current affairs, manual transcripts and minutes of European parliamentary sessions), health, and travel.”)
 +
 +==== Size ====
 +
 +The CoNLL 2007 version contains 70223 tokens in 2902 sentences, yielding 24.20 tokens per sentence on average (CoNLL 2007 data split: 65419 tokens / 2705 sentences training, 4804 tokens / 197 sentences test).
 +
 +==== Inside ====
 +
 +The syntactic annotation style and the tagset for dependency relations (analytical functions) in GDT have been modeled after the [[http://ufal.mff.cuni.cz/pdt2.0/doc/manuals/en/a-layer/html/index.html|Prague Dependency Treebank]].
  
 ==== Sample ==== ==== Sample ====
Line 1411: Line 1596:
 The first sentence of the CoNLL 2007 training data: The first sentence of the CoNLL 2007 training data:
  
-| 1 | L' el da num=s<nowiki>|</nowiki>gen=c ESPEC | _ | _ | +| 1 | PUNCT PUNCT _ | 10 | AuxG | _ | _ | 
-Ajuntament_de_Manresa Ajuntament_de_Manresa np | _ | 4 | SUJ | _ | _ | +| 2 | Τα | ο | At | AtDf | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Nm | 3 Atr | _ | _ | 
-ha haver va num=s<nowiki>|</nowiki>per=3<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>ten=p AUX | _ | _ | +αντισώματα αντίσωμα No NoCm | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Nm | 5 | Sb | _ | _ | 
-| 4 | posat_en_funcionament | posar_en_funcionament | v | vm | num=s<nowiki>|</nowiki>mod=p<nowiki>|</nowiki>gen=m | _ | _ | +| 4 | IgG | IgG | Rg | RgFwOr | _ | 3 | Atr | _ | _ | 
-tot tot di num=s<nowiki>|</nowiki>gen=m | 7 ESPEC | _ | _ | +είναι είμαι Vb VbMn Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx 10 Obj_Co | _ | _ | 
-un_seguit_de un_seguit_de di num=p<nowiki>|</nowiki>gen=c DET | _ | _ | +σαν σαν Ad Ad Ba Adv | _ | _ | 
-mesures mesura nc num=p<nowiki>|</nowiki>gen=f CD | _ | _ | +μακροπρόθεσμη μακροπρόθεσμος Aj Aj Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 8 Atr | _ | _ | 
-| , | , | Fc | _ | 10 | PUNC | _ | _ | +μνήμη μνήμη No NoCm Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm 6 | Adv | _ | _ | 
-la el da num=s<nowiki>|</nowiki>gen=f 10 | ESPEC | _ | _ | +| , | , | PUNCT PUNCT | _ | 10 | AuxX | _ | _ | 
-10 majoria majoria nc num=s<nowiki>|</nowiki>gen=f | _ | _ | _ | +10 ενώ ενώ Cj CjCo 26 Coord | _ | _ | 
-11 informatives informatiu aq num=p<nowiki>|</nowiki>gen=f | 10 | | _ | _ | +11 το ο At AtDf Ne<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 12 | Atr | _ | _ | 
-12 Fc | _ | 10 PUNC | _ | _ | +| 12 | IgA | IgA | Rg | RgFwOr | _ | 15 | Sb | _ | _ | 
-13 que que pr num=n<nowiki>|</nowiki>gen=c | 14 | SUJ | _ | _ | +13 πιστεύεται πιστεύεται Vb VbMn Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx | 10 | Obj_Co | _ | _ | 
-14 tenen tenir vm num=p<nowiki>|</nowiki>per=3<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>ten=p SF | _ | _ | +14 ότι ότι Cj CjSb | _ | 13 AuxC | _ | _ | 
-| 15 | com_a com_a sp for=s 14 CPRED | _ | _ | +15 είναι είμαι Vb VbMn Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx | 14 | Sb | _ | _ | 
-16 finalitat finalitat nc num=s<nowiki>|</nowiki>gen=f 15 SN | _ | _ | +16 ένας ένας At AtId Ma<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 18 | Atr | _ | _ | 
-17 minimitzar minimitzar vm mod=n 14 CD | _ | _ | +| 17 | συγκεκριμένος | συγκεκριμένος | Aj | Aj | Ba<nowiki>|</nowiki>Ma<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 18 Atr | _ | _ | 
-18 els el da num=p<nowiki>|</nowiki>gen=m 19 ESPEC | _ | _ | +| 18 | δείκτης | δείκτης | No | NoCm | Ma<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 15 | Pnom | 
-19 efectes efecte nc num=p<nowiki>|</nowiki>gen=m 17 SN | _ | _ | +19 για | για | AsPp AsPpSp | _ | 18 | AuxP | _ | _ | 
-20 de de sp for=s 19 SP | _ | _ | +20 πρόσφατες πρόσφατος Aj Aj Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 21 | Atr_Co | _ | _ | 
-21 la el da num=s<nowiki>|</nowiki>gen=f 22 ESPEC | _ | _ | +21 ή ή Cj CjCo 23 Coord | _ | _ | 
-22 vaga vaga nc num=s<nowiki>|</nowiki>gen=f 20 SN | _ | _ | +22 χρόνιες χρόνιος Aj Aj Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 21 Atr_Co | _ | _ | 
-23 | . | . | Fp | _ | PUNC | _ | _ |+23 λοιμώξεις λοίμωξη No NoCm Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 19 Atr | _ | _ | 
 +24 PUNCT PUNCT 10 AuxG | _ | _ | 
 +25 , | , | PUNCT | PUNCT | _ | 10 | AuxX | _ | _ | 
 +| 26 | εξηγεί εξηγώ Vb VbMn Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Av<nowiki>|</nowiki>Xx | 0 | Pred | _ | _ | 
 +27 η ο At AtDf Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 28 Atr | _ | _ | 
 +28 | Δρ | Δρ | Rg | RgFwTr | _ | 26 | Sb | _ | _ | 
 +| 29 | Αρκάρι | Αρκάρι | No | NoCm | Ne<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 28 | Atr | _ | _ | 
 +| 30 | . | . | PUNCT PUNCT | _ | AuxK | _ | _ |
  
 The first sentence of the CoNLL 2007 test data: The first sentence of the CoNLL 2007 test data:
  
-| 1 | Tot_i_que tot_i_que cs | _ | 5 | SUBORD | _ | _ | +| 1 | Η | ο | At | AtDf | Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 2 | Atr | _ | _ | 
-ahir ahir rg | _ | 5 | CC | _ | _ | +| 2 | Σίφνος | Σίφνος | No | NoPr | Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 3 | Sb | _ | _ | 
-| 3 | hi hi | pp | num=n<nowiki>|</nowiki>per=3<nowiki>|</nowiki>gen=MORF | _ | _ | +| 3 | φημίζεται | φημίζομαι | Vb | VbMn | Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx | 0 | Pred | _ | _ | 
-| 4 | va anar va num=s<nowiki>|</nowiki>per=3<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>ten=p | AUX | _ | _ | +| 4 | και | και | Cj | CjCo | _ | 5 | AuxY | _ | _ | 
-| 5 | haver haver va mod=15 AO | _ | _ | +| 5 | για | για | AsPp | AsPpSp | _ | 3 | AuxP | _ | _ | 
-| 6 | una un di num=s<nowiki>|</nowiki>gen=ESPEC | _ | _ | +| 6 | τα | ο | At | AtDf | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 8 | Atr | _ | _ | 
-| 7 | reunió reunió nc num=s<nowiki>|</nowiki>gen=| 5 | CD | _ | _ | +| 7 | καταγάλανα | καταγάλανος | Aj | Aj | Ba<nowiki>|</nowiki>Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 8 | Atr | _ | _ | 
-de de sp for=SP | _ | _ | +| 8 | νερά | νερό | No | NoCm | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 5 | Obj | _ | _ | 
-darrera darrer ao num=s<nowiki>|</nowiki>gen=10 SADJ | _ | _ | +| 9 | των | ο | At | AtDf | Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ge | 11 | Atr | _ | _ | 
-10 hora hora nc num=s<nowiki>|</nowiki>gen=| 8 | SN | _ | _ | +| 10 | πανέμορφων | πανέμορφος | Aj | Aj | Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ge | 11 | Atr | _ | _ | 
-11 | , | , | Fc | _ | PUNC | _ | _ | +| 11 | ακτών | ακτή | No | NoCm | Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ge | 8 | Atr | _ | _ | 
-12 no no rn | _ | 15 MOD | _ | _ | +| 12 | της | μου | Pn | PnPo | Fe<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Ge<nowiki>|</nowiki>Xx | 11 | Atr | _ | _ | 
-13 es es p0 15 PASS | _ | _ | +| 13 | . | . | PUNCT | PUNCT | _ | 0 | AuxK | _ | _ | 
-14 va anar va num=s<nowiki>|</nowiki>per=3<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>ten=15 AUX | _ | _ | + 
-| 15 | aconseguir aconseguir vm mod=| _ | _ | +==== Parsing ==== 
-| 16 | acostar acostar vm mod=15 SUJ | _ | _ | + 
-| 17 | posicions posició nc num=p<nowiki>|</nowiki>gen=16 SN | _ | _ | +Nonprojectivities in GDT are not frequent. Only 823 of the 70223 tokens in the CoNLL 2007 version are attached nonprojectively (1.17%). 
-| 18 | Fc 23 PUNC | _ | _ | + 
-| 19 | de_manera_que de_manera_que cs 23 SUBORD | _ | _ | +The results of the CoNLL 2007 shared task are [[http://nextens.uvt.nl/depparse-wiki/AllScores|available online]]. They have been published in [[http://aclweb.org/anthology-new/D/D07/D07-1096.pdf|(Nivre et al., 2007)]]. The evaluation procedure was changed to include punctuation tokens. These are the best results for Greek: 
-| 20 | els el da num=p<nowiki>|</nowiki>gen=| 21 | ESPEC | _ | _ | + 
-| 21 | treballadors treballador nc num=p<nowiki>|</nowiki>gen=23 SUJ | _ | _ | +^ Parser (Authors) ^ LAS ^ UAS ^ 
-| 22 | han haver va num=p<nowiki>|</nowiki>per=3<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>ten=23 AUX | _ | _ | +| Nakagawa | 76.31 | 84.08 | 
-23 decidit decidir vm num=s<nowiki>|</nowiki>mod=p<nowiki>|</nowiki>gen=15 AO | _ | _ | +| Keith Hall et al. | 74.21 | 82.04 | 
-24 anar anar vm mod=23 CD | _ | _ | +| Carreras | 73.56 | 81.37 | 
-25 sp for=24 CREG | _ | _ | +| Malt (Nilsson et al.) | 74.65 | 81.22 | 
-26 la el da num=s<nowiki>|</nowiki>gen=27 ESPEC | _ | _ | +| Titov et al. | 73.52 | 81.20 | 
-27 vaga vaga nc num=s<nowiki>|</nowiki>gen=25 SN | _ | _ | +| Chen | 74.42 | 81.16 | 
-28 Fp | _ | 15 PUNC | _ | _ |+| Duan | 74.29 | 80.77 | 
 +| Attardi et al. | 73.92 | 80.75 | 
 +| Malt (J. Hall et al.) | 74.21 | 80.66 | 
 + 
 +The two Malt parser results of 2007 (single malt and blended) are described in [[http://aclweb.org/anthology-new/D/D07/D07-1097.pdf|(Hall et al., 2007)]] and the details about the parser configuration are described [[http://w3.msi.vxu.se/users/jha/conll07/|here]]. 
 + 
 +===== English (en) ===== 
 + 
 +[[http://www.cis.upenn.edu/~treebank/|Penn Treebank]] 
 + 
 +==== Versions ==== 
 + 
 +  * Penn Treebank 2 (1995) 
 +  * Penn Treebank (1999) 
 +  * CoNLL 2007 
 +  * CoNLL 2008 
 +  * CoNLL 2009 
 + 
 +==== Obtaining and License ==== 
 + 
 +The original Penn Treebank is distributed by the LDC under the catalogue number [[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC99T42|LDC99T42]]. It is free for 1999 LDC members; the price for non-members is unknown (contact the LDC). The [[http://www.ldc.upenn.edu/Catalog/nonmem_agree/generic.license.html|license]] in short: 
 + 
 +  * non-commercial education and research usage 
 +  * no redistribution 
 +  * citation in publications not explicitly required but it is common decency 
 + 
 +The CoNLL 2007, 2008 and 2009 versions are also licensed by the LDC, and LDC members can keep them after the shared task. Those who have not participated in the shared task may inquire at the LDC about the availability of the datasets. Their republication by the LDC is planned but has not happened yet. 
 + 
 +The Penn Treebank was created by members of the [[http://www.cis.upenn.edu/|Department of Computer and Information Science]] (CIS), School of Engineering, University of Pennsylvania, Levine Hall, 3330 Walnut Street, Philadelphia, PA 19104-6309, USA. The constituents-to-dependencies CoNLL 2007 conversion of the treebank was prepared by Ryan McDonald. 
 + 
 +==== References ==== 
 + 
 +  * Website 
 +    * http://www.cis.upenn.edu/~treebank/ 
 +  * Data 
 +    * Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz, Ann Taylor: //Treebank-3// ([[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC99T42|LDC99T42]]). Linguistic Data Consortium, Philadelphia, USA, 2001. ISBN 1-58563-163-9. 
 +  * Principal publications 
 +    * Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz: Building a large annotated corpus of English: the Penn Treebank. //Computational Linguistics,// 19(2):313-330. 1993. 
 +  * Documentation 
 +    * [[http://www.cis.upenn.edu/~treebank/tokenization.html|Tokenization]] 
 +    * Beatrice Santorini: [[ftp://ftp.cis.upenn.edu/pub/treebank/doc/tagguide.ps.gz|Part-of-Speech Tagging Guidelines for the Penn Treebank Project]], 3rd Revision, Philadelphia, USA, 1990. 
 +    * Ann Bies, Mark Ferguson, Karen Katz, Robert MacIntyre: [[ftp://ftp.cis.upenn.edu/pub/treebank/doc/manual/root.ps.gz|Bracketing Guidelines for Treebank II Style, Penn Treebank Project]], Philadelphia, USA, 1995. 
 +    * Robert MacIntyre: [[ftp://ftp.cis.upenn.edu/pub/treebank/doc/faq.cd2|NP Heads and Base NPs]] (Treebank FAQ) 
 +    * Richard Johansson, Pierre Nugues: [[http://dspace.utlib.ee/dspace/bitstream/handle/10062/2560/reg-Johansson-10.pdf;jsessionid=BB8432D9BAE4FCF9DD9BD746704E796F?sequence=1|Extended constituent-to-dependency conversion for English]]. In: Proceedings of the 16th Nordic Conference on Computational Linguistics (NODALIDA), pp. 105-112, Tartu, Estonia, 2007. 
 + 
 +==== Domain ==== 
 + 
 +Financial news from the Wall Street Journal (1989). The constituent-based Treebank-3 also contains parsed versions of ATIS-3 and of the Brown Corpus. Only WSJ texts have been converted to dependencies for the CoNLL shared tasks. 
 + 
 +==== Size ==== 
 + 
 +The size of the CoNLL 2007 data was limited because some CoNLL 2006 teams complained that they did not have enough time and resources to train larger models. Sections 2-11 of the Wall Street Journal part of the treebank were used for training and a subset of section 23 was used for testing. 
 + 
 +^ Version ^ Train Sentences ^ Train Tokens ^ D-test Sentences ^ D-test Tokens ^ E-test Sentences ^ E-test Tokens ^ Total Sentences ^ Total Tokens ^ Sentence Length ^ 
 +| CoNLL 2007 |  18577 |    446,573 |   214 |     5003 |        |          |  18791 |    451,576 |  24.03 | 
 +| CoNLL 2009 |  39279 |    958,167 |  1334 |    33368 |   2399 |    57676 |  43012 |  1,049,211 |  24.39 | 
 + 
 +==== Inside ==== 
 + 
 +CoNLL 2007: Many function tags were removed from the non-terminals in the phrase-structure representation. The phrase structures were converted to dependency structures using the procedure described in Richard Johansson, Pierre Nugues: [[http://dspace.utlib.ee/dspace/bitstream/handle/10062/2560/reg-Johansson-10.pdf;jsessionid=BB8432D9BAE4FCF9DD9BD746704E796F?sequence=1|Extended constituent-to-dependency conversion for English]]. In: Proceedings of the 16th Nordic Conference on Computational Linguistics (NODALIDA), pp. 105-112, Tartu, Estonia, 2007. 
 + 
 +PDT 1.0 is distributed in the [[::format-csts|CSTS format]]. PDT 2.0 uses the [[::format-pml|PML format]]. CoNLL 2006 and 2007 use the [[:format-conll|CoNLL-X format]]; the CoNLL 2009 format is slightly different (number and meaning of columns). Unlike the other formats, the CSTS format used the ISO-8859-2 character encoding. 
 + 
 +The CSTS format (PDT 0.5 and 1.0) contains both manual morphological annotation (lemmas and tags) and the output of two taggers. The CoNLL 2009 version contains manual and one automatic disambiguation. The official distribution of PDT 2.0 and the CoNLL 2006 and 2007 versions contain only manual morphology. 
 + 
 +The original PDT uses 15-character positional morphological tags. The CoNLL versions convert the tags to the two/three CoNLL columns, CPOS, POS and FEAT. In addition, the CoNLL versions contain the Sem feature, which is derived from the tags attached to lemma in PDT (see [[http://ufal.mff.cuni.cz/pdt2.0/doc/manuals/en/m-layer/pdf/m-man-en.pdf|Hana and Zeman, 2005]]). 
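
The split of a positional tag into the three CoNLL columns can be sketched as below. The feature abbreviations (''Gen'', ''Num'', ''Cas'', ...) and the decision to skip the two reserved positions are assumptions made for illustration, so check them against the README of the CoNLL distribution. The ''Sem'' feature is not covered because it comes from the lemma, not from the tag.

```python
# Assumed feature name for each of the 15 tag positions; None marks positions
# that go to CPOS/POS (1-2) or are treated as reserved (13-14).
POSITIONS = [None, None, "Gen", "Num", "Cas", "PGe", "PNu", "Per",
             "Ten", "Gra", "Neg", "Voi", None, None, "Var"]


def split_tag(tag):
    """Split a 15-character positional tag into (CPOS, POS, FEAT)."""
    if len(tag) != 15:
        raise ValueError(f"expected 15 characters, got {len(tag)}: {tag!r}")
    cpos, pos = tag[0], tag[1]
    # Keep only named positions that carry a value ("-" means not applicable).
    feats = [f"{name}={value}"
             for name, value in zip(POSITIONS, tag)
             if name is not None and value != "-"]
    return cpos, pos, "|".join(feats) or "_"


# Tags taken from the PDT 1.0 samples on this page.
print(split_tag("AAFS1----2A----"))
print(split_tag("Cv-------------"))
print(split_tag("NNMP1-----A---1"))
```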
 + 
 +See above for documentation of the morphological tags. All CoNLL distributions contain a README file with a brief description of the parts of speech and features. Use [[http://quest.ms.mff.cuni.cz/cgi-bin/interset/index.pl?tagset=cs::pdt|DZ Interset]] to inspect the PDT and the CoNLL tagsets. 
 + 
 +The guidelines for syntactic annotation are documented in the [[http://ufal.mff.cuni.cz/pdt2.0/doc/manuals/en/a-layer/html/index.html|PDT annotation manual]]. 
 + 
 +==== Sample ==== 
 + 
 +The first sentence of the PDT 1.0 training data: 
 + 
 +<code xml><csts lang=cs> 
 +<h> 
 +<source>Českomoravský profit</source> 
 +<markup> 
 +<mauth>js 
 +<mdate>1996-2000 
 +<mdesc>Manual analytical annotation 
 +</markup> 
 +<markup> 
 +<mauth>kk,lk 
 +<mdate>1996-2000 
 +<mdesc>Manual morphological annotation 
 +</markup> 
 +</h> 
 +<doc file="s/inf/j/1994/cmpr9406" id="001"> 
 +<a> 
 +<mod>
 +<txtype>inf 
 +<genre>mix 
 +<med>
 +<temp>1994 
 +<authname>
 +<opus>cmpr9406 
 +<id>001 
 +</a> 
 +<c> 
 +<p n=1> 
 +<s id="cmpr9406:001-p1s1"> 
 +<p n=2> 
 +<s id="cmpr9406:001-p2s1"> 
 +<f cap>Třikrát<l>třikrát`3<t>Cv-------------<MDl src="a">třikrát`3<MDt src="a">Cv-------------<MDl src="b">třikrát`3<MDt src="b">Cv-------------<A>Adv<r>1<g>
 +<f>rychlejší<l>rychlý<t>AAFS1----2A----<MDl src="a">rychlý<MDt src="a">AANS1----2A----<MDl src="b">rychlý<MDt src="b">AAFS1----2A----<A>ExD<r>2<g>
 +<f>než<l>než-2<t>J,-------------<MDl src="a">než-2<MDt src="a">J,-------------<MDl src="b">než-2<MDt src="b">J,-------------<A>AuxC<r>3<g>
 +<f>slovo<l>slovo<t>NNNS1-----A----<MDl src="a">slovo<MDt src="a">NNNS4-----A----<MDl src="b">slovo<MDt src="b">NNNS1-----A----<A>ExD<r>4<g>3</code> 
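
A rough way to pull the manually annotated fields out of one CSTS token line is sketched below. It is a regular-expression shortcut, not an SGML parser. Reading ''<r>'' as the word order and ''<g>'' as the governor is an assumption, as are the function and key names. Real files would have to be opened with ''encoding="iso-8859-2"'', as noted above.

```python
import re

# Token lines start with <f ...> (word) or <d ...> (punctuation).
TOKEN_RE = re.compile(r"^<(?P<kind>[fd])(?: [^>]*)?>(?P<form>[^<]*)")
# Manual lemma <l>, manual tag <t>, analytical function <A>, <r> and <g>.
# Tagger output (<MDl ...>, <MDt ...>) is deliberately not matched.
FIELD_RE = re.compile(r"<(?P<tag>l|t|A|r|g)>(?P<value>[^<]*)")


def parse_token(line):
    """Return a dict for a token line, or None for any other line."""
    match = TOKEN_RE.match(line)
    if match is None:
        return None
    fields = {m.group("tag"): m.group("value").strip()
              for m in FIELD_RE.finditer(line)}
    return {
        "form": match.group("form"),
        "lemma": fields.get("l"),
        "tag": fields.get("t"),
        "afun": fields.get("A"),
        "order": fields.get("r"),      # assumed meaning: position in the sentence
        "governor": fields.get("g"),   # assumed meaning: position of the head
    }


# Last token line of the training sample above.
LINE = ('<f>slovo<l>slovo<t>NNNS1-----A----<MDl src="a">slovo'
        '<MDt src="a">NNNS4-----A----<MDl src="b">slovo'
        '<MDt src="b">NNNS1-----A----<A>ExD<r>4<g>3')

PARSED = parse_token(LINE)
print(PARSED)
print(parse_token('<s id="cmpr9406:001-p2s1">'))
```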
 + 
 +The first two sentences of the PDT 1.0 d-test data: 
 + 
 +<code xml><csts lang=cs> 
 +<h> 
 +<source>Lidové noviny</source> 
 +<markup> 
 +<mauth>zu 
 +<mdate>1996-2000 
 +<mdesc>Manual analytical annotation 
 +</markup> 
 +</h> 
 +<doc file="s/pub/nws/1994/ln94206" id="1"> 
 +<a> 
 +<mod>
 +<txtype>pub 
 +<genre>mix 
 +<med>nws 
 +<temp>1994 
 +<authname>
 +<opus>ln94206 
 +<id>
 +</a> 
 +<c> 
 +<p n=1> 
 +<s id="ln94206:1-p1s1"> 
 +<i>ti 
 +<f cap>Lidé<MDl src="a">člověk<MDt src="a">NNMP1-----A---1<MDl src="b">člověk<MDt src="b">NNMP1-----A---1<A>ExD<r>1<g>
 +<p n=2> 
 +<s id="ln94206:1-p2s1"> 
 +<f upper.abbr>ING<MDl src="a">Ing-1_:B_^(inženýr)<MDt src="a">NNMXX-----A---8<MDl src="b">Ing-1_:B_^(inženýr)<MDt src="b">NNMXX-----A---8<A>Atr<r>1<g>
 +<D> 
 +<d>.<MDl src="a">.<MDt src="a">Z:-------------<MDl src="b">.<MDt src="b">Z:-------------<A>AuxG<r>2<g>
 +<f upper>PETR<MDl src="a">Petr_;Y<MDt src="a">NNMS1-----A----<MDl src="b">Petr_;Y<MDt src="b">NNMS1-----A----<A>Atr<r>3<g>
 +<f upper>KARAS<MDl src="a">karas<MDt src="a">NNMS1-----A----<MDl src="b">karas<MDt src="b">NNMS1-----A----<A>Sb_Ap<r>4<g>11 
 +<D> 
 +<d>,<MDl src="a">,<MDt src="a">Z:-------------<MDl src="b">,<MDt src="b">Z:-------------<A>AuxX<r>5<g>
 +<f mixed>CSc<MDl src="a">CSc-1_:B_^(kandidát_věd)<MDt src="a">NNMXX-----A---8<MDl src="b">CSc-1_:B_^(kandidát_věd)<MDt src="b">NNMXX-----A---8<A>Atr<r>6<g>
 +<D> 
 +<d>.<MDl src="a">.<MDt src="a">Z:-------------<MDl src="b">.<MDt src="b">Z:-------------<A>AuxG<r>7<g>
 +<d>(<MDl src="a">(<MDt src="a">Z:-------------<MDl src="b">(<MDt src="b">Z:-------------<A>ExD<r>8<g>
 +<D> 
 +<f num>53<MDl src="a">53<MDt src="a">C=-------------<MDl src="b">53<MDt src="b">C=-------------<A>ExD_Pa<r>9<g>
 +<D> 
 +<d>)<MDl src="a">)<MDt src="a">Z:-------------<MDl src="b">)<MDt src="b">Z:-------------<A>ExD<r>10<g>
 +<D> 
 +<d>,<MDl src="a">,<MDt src="a">Z:-------------<MDl src="b">,<MDt src="b">Z:-------------<A>Apos<r>11<g>20 
 +<f>generální<MDl src="a">generální<MDt src="a">AAMS1----1A----<MDl src="b">generální<MDt src="b">AAMS1----1A----<A>Atr<r>12<g>13 
 +<f>ředitel<MDl src="a">ředitel<MDt src="a">NNMS1-----A----<MDl src="b">ředitel<MDt src="b">NNMS1-----A----<A>Sb_Co<r>13<g>15 
 +<f upper>ČEZ<MDl src="a">ČEZ-1_:B_;K_^(České_energetické_závody)<MDt src="a">NNIPX-----A---8<MDl src="b">ČEZ-1_:B_;K_^(České_energetické_závody)<MDt src="b">NNIPX-----A---8<A>Atr<r>14<g>13 
 +<f>a<MDl src="a">a-1<MDt src="a">J^-------------<MDl src="b">a-1<MDt src="b">J^-------------<A>Coord_Ap<r>15<g>11 
 +<f>předseda<MDl src="a">předseda<MDt src="a">NNMS1-----A----<MDl src="b">předseda<MDt src="b">NNMS1-----A----<A>Sb_Co<r>16<g>15 
 +<f>jeho<MDl src="a">jeho_^(přivlast.)<MDt src="a">PSXXXZS3-------<MDl src="b">jeho_^(přivlast.)<MDt src="b">PSXXXZS3-------<A>Atr<r>17<g>18 
 +<f>představenstva<MDl src="a">představenstvo<MDt src="a">NNNS2-----A----<MDl src="b">představenstvo<MDt src="b">NNNS2-----A----<A>Atr<r>18<g>16 
 +<D> 
 +<d>,<MDl src="a">,<MDt src="a">Z:-------------<MDl src="b">,<MDt src="b">Z:-------------<A>AuxX<r>19<g>11 
 +<f>je<MDl src="a">být<MDt src="a">VB-S---3P-AA---<MDl src="b">být<MDt src="b">VB-S---3P-AA---<A>Pred<r>20<g>
 +<f>absolventem<MDl src="a">absolvent<MDt src="a">NNMS7-----A----<MDl src="b">absolvent<MDt src="b">NNMS7-----A----<A>Pnom<r>21<g>20 
 +<f>elektrotechnické<MDl src="a">elektrotechnický<MDt src="a">AAFS2----1A----<MDl src="b">elektrotechnický<MDt src="b">AAFS2----1A----<A>Atr<r>22<g>23 
 +<f>fakulty<MDl src="a">fakulta<MDt src="a">NNFS2-----A----<MDl src="b">fakulta<MDt src="b">NNFS2-----A----<A>Atr_Co<r>23<g>25 
 +<f upper>ČVUT<MDl src="a">ČVUT-1_:B_;K_^(České_vysoké_učení_technické)<MDt src="a">NNNXX-----A---8<MDl src="b">ČVUT-1_:B_;K_^(České_vysoké_učení_technické)<MDt src="b">NNNXX-----A---8<A>Atr<r>24<g>23 
 +<f>a<MDl src="a">a-1<MDt src="a">J^-------------<MDl src="b">a-1<MDt src="b">J^-------------<A>Coord<r>25<g>21 
 +<f>postgraduálního<MDl src="a">postgraduální<MDt src="a">AANS2----1A----<MDl src="b">postgraduální<MDt src="b">AANS2----1A----<A>Atr<r>26<g>27 
 +<f>studia<MDl src="a">studium<MDt src="a">NNNS2-----A----<MDl src="b">studium<MDt src="b">NNNS2-----A----<A>Atr_Co<r>27<g>25 
 +<f>v<MDl src="a">v-1<MDt src="a">RR--6----------<MDl src="b">v-1<MDt src="b">RR--6----------<A>AuxP<r>28<g>29 
 +<f>oboru<MDl src="a">obor_^(lidské_činnosti)<MDt src="a">NNIS6-----A----<MDl src="b">obor_^(lidské_činnosti)<MDt src="b">NNIS6-----A----<A>AuxP<r>29<g>27 
 +<f>metod<MDl src="a">metoda<MDt src="a">NNFP2-----A----<MDl src="b">metoda<MDt src="b">NNFP2-----A----<A>Atr<r>30<g>29 
 +<f>operační<MDl src="a">operační<MDt src="a">AAFS2----1A----<MDl src="b">operační<MDt src="b">AAFS2----1A----<A>Atr<r>31<g>32 
 +<f>analýzy<MDl src="a">analýza<MDt src="a">NNFS2-----A----<MDl src="b">analýza<MDt src="b">NNFS2-----A----<A>Atr<r>32<g>30 
 +<D> 
 +<d>.<MDl src="a">.<MDt src="a">Z:-------------<MDl src="b">.<MDt src="b">Z:-------------<A>AuxK<r>33<g>0</code> 
 + 
 +The first sentence of the PDT 1.0 e-test data: 
 + 
 +<code xml><csts lang=cs> 
 +<h> 
 +<source>Lidové noviny</source> 
 +<markup> 
 +<mauth>zu 
 +<mdate>1996-2000 
 +<mdesc>Manual analytical annotation 
 +</markup> 
 +</h> 
 +<doc file="s/pub/nws/1994/ln94209" id="1"> 
 +<a> 
 +<mod>
 +<txtype>pub 
 +<genre>mix 
 +<med>nws 
 +<temp>1994 
 +<authname>
 +<opus>ln94209 
 +<id>
 +</a> 
 +<c> 
 +<p n=1> 
 +<s id="ln94209:1-p1s1"> 
 +<f cap>Přádelny<MDl src="a">přádelna<MDt src="a">NNFP1-----A----<MDl src="b">přádelna<MDt src="b">NNFP1-----A----<A>Sb<r>1<g>
 +<f>mají<MDl src="a">mít<MDt src="a">VB-P---3P-AA---<MDl src="b">mít<MDt src="b">VB-P---3P-AA---<A>Pred<r>2<g>
 +<f>dvojnásob<MDl src="a">dvojnásob<MDt src="a">Db-------------<MDl src="b">dvojnásob<MDt src="b">Db-------------<A>Obj<r>3<g>
 +<f>vad<MDl src="a">vada<MDt src="a">NNFP2-----A----<MDl src="b">vada<MDt src="b">NNFP2-----A----<A>Atr<r>4<g>3</code> 
 + 
 +Morphological annotation of the first amw training file of the PDT 2.0: 
 + 
 +<code xml><mdata xmlns="http://ufal.mff.cuni.cz/pdt/pml/"> 
 + <head> 
 +  <schema href="mdata_schema.xml" /> 
 +  <references> 
 +   <reffile id="w" name="wdata" href="cmpr9406_001.w.gz" /> 
 +  </references> 
 + </head> 
 + <meta> 
 +  <lang>cs</lang> 
 +  <annotation_info id="manual"> 
 +   <desc>Manual annotation</desc> 
 +  </annotation_info> 
 + </meta> 
 + <s id="m-cmpr9406-001-p2s1"> 
 +  <m id="m-cmpr9406-001-p2s1w1"> 
 +   <src.rf>manual</src.rf> 
 +   <w.rf>w#w-cmpr9406-001-p2s1w1</w.rf> 
 +   <form>Třikrát</form> 
 +   <lemma>třikrát`3</lemma> 
 +   <tag>Cv-------------</tag> 
 +  </m> 
 +  <m id="m-cmpr9406-001-p2s1w2"> 
 +   <src.rf>manual</src.rf> 
 +   <w.rf>w#w-cmpr9406-001-p2s1w2</w.rf> 
 +   <form>rychlejší</form> 
 +   <lemma>rychlý</lemma> 
 +   <tag>AAFS1----2A----</tag> 
 +  </m> 
 +  <m id="m-cmpr9406-001-p2s1w3"> 
 +   <src.rf>manual</src.rf> 
 +   <w.rf>w#w-cmpr9406-001-p2s1w3</w.rf> 
 +   <form>než</form> 
 +   <lemma>než-2</lemma> 
 +   <tag>J,-------------</tag> 
 +  </m> 
 +  <m id="m-cmpr9406-001-p2s1w4"> 
 +   <src.rf>manual</src.rf> 
 +   <w.rf>w#w-cmpr9406-001-p2s1w4</w.rf> 
 +   <form>slovo</form> 
 +   <lemma>slovo</lemma> 
 +   <tag>NNNS1-----A----</tag> 
 +  </m> 
 + </s></code> 
 + 
 +Analytical (surface-syntactic) annotation of the first amw training file of the PDT 2.0: 
 + 
 +<code xml><adata xmlns="http://ufal.mff.cuni.cz/pdt/pml/"> 
 + <head> 
 +  <schema href="adata_schema.xml" /> 
 +  <references> 
 +   <reffile id="m" name="mdata" href="cmpr9406_001.m.gz" /> 
 +   <reffile id="w" name="wdata" href="cmpr9406_001.w.gz" /> 
 +  </references> 
 + </head> 
 + <meta> 
 +  <annotation_info> 
 +   <desc>Manual annotation</desc> 
 +  </annotation_info> 
 + </meta> 
 + <trees> 
 +  <LM id="a-cmpr9406-001-p2s1"> 
 +   <s.rf>m#m-cmpr9406-001-p2s1</s.rf> 
 +   <ord>0</ord> 
 +   <children> 
 +    <LM id="a-cmpr9406-001-p2s1w2"> 
 +     <m.rf>m#m-cmpr9406-001-p2s1w2</m.rf> 
 +     <afun>ExD</afun> 
 +     <ord>2</ord> 
 +     <children> 
 +      <LM id="a-cmpr9406-001-p2s1w1"> 
 +       <m.rf>m#m-cmpr9406-001-p2s1w1</m.rf> 
 +       <afun>Adv</afun> 
 +       <ord>1</ord> 
 +      </LM> 
 +      <LM id="a-cmpr9406-001-p2s1w3"> 
 +       <m.rf>m#m-cmpr9406-001-p2s1w3</m.rf> 
 +       <afun>AuxC</afun> 
 +       <ord>3</ord> 
 +       <children> 
 +        <LM id="a-cmpr9406-001-p2s1w4"> 
 +         <m.rf>m#m-cmpr9406-001-p2s1w4</m.rf> 
 +         <afun>ExD</afun> 
 +         <ord>4</ord> 
 +        </LM> 
 +       </children> 
 +      </LM> 
 +     </children> 
 +    </LM> 
 +   </children> 
 +  </LM></code> 
 + 
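
In the analytical layer the dependency structure is encoded by nesting: every node is an `<LM>` element, and its dependents sit inside its `<children>` element. The sketch below recovers (ord, parent ord, afun) triples from such a tree with Python's standard `xml.etree.ElementTree`. The function name is hypothetical, the embedded document is an abbreviated and closed copy of the sample above (the `<m.rf>` and `<s.rf>` references are omitted), and the code assumes that every child is wrapped in `<LM>`, as it is in the sample.

```python
import xml.etree.ElementTree as ET

# All elements live in the PML namespace declared on <adata>.
PML = "{http://ufal.mff.cuni.cz/pdt/pml/}"

SAMPLE = """<adata xmlns="http://ufal.mff.cuni.cz/pdt/pml/">
 <trees>
  <LM id="a-cmpr9406-001-p2s1">
   <ord>0</ord>
   <children>
    <LM id="a-cmpr9406-001-p2s1w2">
     <afun>ExD</afun>
     <ord>2</ord>
     <children>
      <LM id="a-cmpr9406-001-p2s1w1"><afun>Adv</afun><ord>1</ord></LM>
      <LM id="a-cmpr9406-001-p2s1w3">
       <afun>AuxC</afun>
       <ord>3</ord>
       <children>
        <LM id="a-cmpr9406-001-p2s1w4"><afun>ExD</afun><ord>4</ord></LM>
       </children>
      </LM>
     </children>
    </LM>
   </children>
  </LM>
 </trees>
</adata>"""


def read_dependencies(tree_root):
    """Return sorted (ord, parent_ord, afun) triples for one tree."""
    edges = []

    def visit(node, parent_ord):
        node_ord = int(node.findtext(PML + "ord"))
        if parent_ord is not None:  # the technical root has no parent
            edges.append((node_ord, parent_ord, node.findtext(PML + "afun")))
        children = node.find(PML + "children")
        if children is not None:
            for child in children.findall(PML + "LM"):
                visit(child, node_ord)

    visit(tree_root, None)
    return sorted(edges)


if __name__ == "__main__":
    document = ET.fromstring(SAMPLE)
    for tree in document.find(PML + "trees").findall(PML + "LM"):
        print(read_dependencies(tree))
```

The parent of the top-level word is the technical root with `ord` 0, which corresponds to head 0 in the CoNLL files.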
 +The first two sentences of the CoNLL 2006 and 2007 training data: 
 + 
 +| 1 | Třikrát | třikrát`3 | C | v | _ | 2 | Adv | _ | _ | 
 +| 2 | rychlejší | rychlý | A | A | Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=2<nowiki>|</nowiki>Neg=A | 0 | ExD | _ | _ | 
 +| 3 | než | než-2 | J | , | _ | 2 | AuxC | _ | _ | 
 +| 4 | slovo | slovo | N | N | Gen=N<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=A | 3 | ExD | _ | _ | 
 +| |||||||||| 
 +| 1 | Faxu | fax | N | N | Gen=I<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=3<nowiki>|</nowiki>Neg=A | 2 | Obj | _ | _ | 
 +| 2 | škodí | škodit | V | B | Num=P<nowiki>|</nowiki>Per=3<nowiki>|</nowiki>Ten=P<nowiki>|</nowiki>Neg=A<nowiki>|</nowiki>Voi=A | 0 | Pred | _ | _ | 
 +| 3 | především | především | D | b | _ | 6 | AuxZ | _ | _ | 
 +| 4 | přetížené | přetížený | A | A | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=A | 6 | Atr | _ | _ | 
 +| 5 | telefonní | telefonní | A | A | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=A | 6 | Atr | _ | _ | 
 +| 6 | linky | linka | N | N | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=A | 2 | Sb | _ | _ | 
 +| 7 | _ | 2 | AuxG | _ | _ | 
 + 
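
The table above shows the CoNLL columns as wiki table cells. In the CoNLL 2006 and 2007 files themselves, each token is one line of ten tab-separated columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) and sentences are separated by an empty line. A minimal reader under these assumptions might look as follows; the function names and the dictionary keys are hypothetical, and the embedded sample reuses rows from the table above.

```python
COLUMNS = ["id", "form", "lemma", "cpostag", "postag",
           "feats", "head", "deprel", "phead", "pdeprel"]


def parse_feats(feats):
    """Turn 'Gen=F|Num=S' into {'Gen': 'F', 'Num': 'S'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_conll(text):
    """Return a list of sentences, each a list of token dictionaries."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():  # an empty line closes the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        values = line.split("\t")
        if len(values) != len(COLUMNS):
            raise ValueError("expected %d columns: %r" % (len(COLUMNS), line))
        token = dict(zip(COLUMNS, values))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])
        token["feats"] = parse_feats(token["feats"])
        current.append(token)
    if current:
        sentences.append(current)
    return sentences


SAMPLE = "\n".join([
    "\t".join(["1", "Třikrát", "třikrát`3", "C", "v", "_", "2", "Adv", "_", "_"]),
    "\t".join(["2", "rychlejší", "rychlý", "A", "A",
               "Gen=F|Num=S|Cas=1|Gra=2|Neg=A", "0", "ExD", "_", "_"]),
    "\t".join(["3", "než", "než-2", "J", ",", "_", "2", "AuxC", "_", "_"]),
    "",
    "\t".join(["1", "Faxu", "fax", "N", "N",
               "Gen=I|Num=S|Cas=3|Neg=A", "2", "Obj", "_", "_"]),
])

if __name__ == "__main__":
    for sentence in read_conll(SAMPLE):
        print([(tok["form"], tok["head"], tok["deprel"]) for tok in sentence])
```

The reader converts HEAD to an integer, so it is not suitable for files in which the head is left unspecified as `_`.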
 +The first sentence of the CoNLL 2006 test data: 
 + 
 +| 1 | Podobně | podobně | D | g | Gra=1<nowiki>|</nowiki>Neg=A | 5 | Adv | _ | _ | 
 +| 2 | , | , | Z | : | _ | 3 | AuxX | _ | _ | 
 +| 3 | myslím | myslit | V | B | Num=S<nowiki>|</nowiki>Per=1<nowiki>|</nowiki>Ten=P<nowiki>|</nowiki>Neg=A<nowiki>|</nowiki>Voi=A | 5 | Pred_Pa | _ | _ | 
 +| 4 | , | , | Z | : | _ | 3 | AuxX | _ | _ | 
 +| 5 | postupuje | postupovat | V | B | Num=S<nowiki>|</nowiki>Per=3<nowiki>|</nowiki>Ten=P<nowiki>|</nowiki>Neg=A<nowiki>|</nowiki>Voi=A | 0 | Pred | _ | _ | 
 +| 6 | většina | většina | N | N | Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=A | 5 | Sb | _ | _ | 
 +| 7 | českých | český | A | A | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=2<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=A | 8 | Atr | _ | _ | 
 +| 8 | bank | banka | N | N | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=2<nowiki>|</nowiki>Neg=A | 6 | Atr | _ | _ | 
 +| 9 | , | , | Z | : | _ | 11 | AuxX | _ | _ | 
 +| 10 | zejména | zejména | D | b | _ | 12 | AuxZ | _ | _ | 
 +| 11 | v | v-1 | R | R | Cas=6 | | AuxP | _ | _ | 
 +| 12 | případech | případ | N | N | Gen=I<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=6<nowiki>|</nowiki>Neg=A | 11 | Adv | _ | _ | 
 +| 13 | , | , | Z | : | _ | 17 | AuxX | _ | _ | 
 +| 14 | kdy | kdy | D | b | _ | 17 | Adv | _ | _ | 
 +| 15 | by | být | V | c | Num=X<nowiki>|</nowiki>Per=3 | 17 | AuxV | _ | _ | 
 +| 16 | se | se | P | 7 | Num=X<nowiki>|</nowiki>Cas=4 | 18 | AuxT | _ | _ | 
 +| 17 | mělo | mít | V | p | Gen=N<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Per=X<nowiki>|</nowiki>Ten=R<nowiki>|</nowiki>Neg=A<nowiki>|</nowiki>Voi=A | 12 | Atr | _ | _ | 
 +| 18 | jednat | jednat | V | f | Neg=A | 17 | Obj | _ | _ | 
 +| 19 | o | o-1 | R | R | Cas=4 | 18 | AuxP | _ | _ | 
 +| 20 | větší | velký | A | A | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Gra=2<nowiki>|</nowiki>Neg=A | 21 | Atr | _ | _ | 
 +| 21 | částky | částka | N | N | Gen=F<nowiki>|</nowiki>Num=P<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Neg=A | 19 | Obj | _ | _ | 
 +| 22 | _ | 0 | AuxK | _ | _ | 
 + 
 +The first sentence of the CoNLL 2007 test data: 
 + 
 +| 1 | Proč | proč | D | b | _ | 2 | Adv | _ | _ | 
 +| 2 | mají | mít | V | B | Num=P<nowiki>|</nowiki>Per=3<nowiki>|</nowiki>Ten=P<nowiki>|</nowiki>Neg=A<nowiki>|</nowiki>Voi=A | 0 | Pred | _ | _ | 
 +| 3 | každý | každý | A | A | Gen=I<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=A | 4 | Atr | _ | _ | 
 +| 4 | rok | rok | N | N | Gen=I<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Neg=A | 5 | Adv | _ | _ | 
 +| 5 | fasovat | fasovat | V | f | Neg=A | | Obj | _ | _ | 
 +| 6 | speciální | speciální | A | A | Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=A | 7 | Atr | _ | _ | 
 +| 7 | taxu | taxa | N | N | Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Neg=A | 5 | Obj | _ | _ | 
 +| 8 | na | na | R | R | Cas=4 | 7 | AuxP | _ | _ | 
 +| 9 | oblečení | oblečení | N | N | Gen=N<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=4<nowiki>|</nowiki>Neg=A | 8 | AtrAdv | _ | _ | 
 +| 10 | ? | ? | Z | : | _ | 0 | AuxK | _ | _ |
  
 The first sentence of the CoNLL 2009 training data:
  
-| 1 | El el el postype=article<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | 2 | 2 | spec | spec | _ | _ | _ | _ | _ | _ | +| 1 | Celní celní celní SubPOS=A<nowiki>|</nowiki>Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=SubPOS=A<nowiki>|</nowiki>Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=| 2 | 2 | Atr Atr celní | _ | RSTR | _ | 
-| 2 | Tribunal_Suprem | Tribunal_Suprem | Tribunal_Suprem | n | n | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 7 | 7 | suj | suj | _ | _ | arg0-agt | _ | _ | _ | +unie unie unie SubPOS=N<nowiki>|</nowiki>Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=SubPOS=N<nowiki>|</nowiki>Gen=F<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=| 0 | 0 | ExD ExD | Y | unie | _ | _ | _ | 
-| 3 | ( | ( | ( | f | f | punct=bracket<nowiki>|</nowiki>punctenclose=open | punct=bracket<nowiki>|</nowiki>punctenclose=open | 4 | 4 | f | f | _ | _ | _ | _ | _ | _ | +| 3 | v | v | v | SubPOS=R<nowiki>|</nowiki>Cas=SubPOS=R<nowiki>|</nowiki>Cas=AuxP AuxP | _ | _ | _ | _ | _ | 
-| 4 | TS | TS | TS | n | n | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 2 | 2 | sn sn | _ | _ | _ | _ | +ohrožení ohrožení ohrožení SubPOS=N<nowiki>|</nowiki>Gen=N<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=6<nowiki>|</nowiki>Neg=SubPOS=N<nowiki>|</nowiki>Gen=N<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=6<nowiki>|</nowiki>Neg=| 3 | 3 | Atr Atr | Y | v-w3017f1 | _ | _ | _ |
-punct=bracket<nowiki>|</nowiki>punctenclose=close | punct=bracket<nowiki>|</nowiki>punctenclose=close | 4 | 4 | f | f | _ | _ | _ | _ | _ | _ | +
-| 6 | ha | haver | haver | v | v | postype=auxiliary<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=present | postype=auxiliary<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=present | 7 | 7 | v | v | _ | _ | _ | _ | _ | _ | +
-| 7 | confirmat | confirmar | confirmar | v | v | postype=main<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>mood=pastparticiple | postype=main<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>mood=pastparticiple | 0 | 0 | sentence sentence | Y | confirmar.a32 | _ | _ | _ | _ | +
-| 8 | la | el | el | d | d | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 9 | 9 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 9 | condemna | condemna | condemna | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 7 | 7 | cd | cd | _ | _ | arg1-pat | _ | _ | _ | +
-| 10 | a | a | a | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 9 | 9 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 11 | quatre | quatre | quatre | d | d | postype=numeral<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | postype=numeral<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | 12 | 12 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 12 | anys | any | any | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 10 | 10 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 13 | d' | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 12 | 12 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 14 | inhabilitació | inhabilitació | inhabilitació | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 13 | 13 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 15 | especial | especial | especial | a | a | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | 14 | 14 | s.a | s.a | _ | _ | _ | _ | _ | _ | +
-| 16 | i | i | i | c | c | postype=coordinating | postype=coordinating | 12 | 9 | coord | coord | _ | _ | _ | _ | _ | _ | +
-| 17 | una | un | un | d | d | postype=indefinite<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=numeral<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 18 | 18 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 18 | multa | multa | multa | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 12 | 9 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 19 | de | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 18 | 18 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 20 | 3,6 | 3.6 | 3,6 | z | n | _ | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 21 | 21 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 21 | milions | milió | milió | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 19 | 19 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 22 | de | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 21 | 21 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 23 | pessetes | pesseta | pesseta | z | n | postype=currency | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | 22 | 22 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 24 | per | per | per | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 9 | 9 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 25 | a | a | a | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 24 | 24 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 26 | quatre | quatre | quatre | d | d | postype=numeral<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | postype=numeral<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | 27 | 27 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 27 | veterinaris | veterinari | veterinari | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 25 | 25 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 28 | gironins | gironí | gironí | a | a | postype=qualificative<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=qualificative<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 27 | 27 | s.a | s.a | _ | _ | _ | _ | _ | _ | +
-| 29 | , | , | , | f | f | punct=comma | punct=comma | 30 | 30 | f | f | _ | _ | _ | _ | _ | _ | +
-| 30 | per | per | per | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 9 | 7 | sp | cc | _ | _ | _ | _ | _ | _ | +
-| 31 | haver | haver | haver | v | n | postype=auxiliary<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c<nowiki>|</nowiki>mood=infinitive | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | 33 | 33 | v | v | _ | _ | _ | _ | _ | _ | +
-| 32 | -se | ell | ell | p gen=c<nowiki>|</nowiki>num=c<nowiki>|</nowiki>person=3 gen=c<nowiki>|</nowiki>num=c<nowiki>|</nowiki>person=3 33 33 morfema.pronominal | morfema.pronominal | _ | _ | _ | _ | _ | _ | +
-33 beneficiat beneficiar beneficiat postype=main<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>mood=pastparticiple | postype=qualificative<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>posfunction=participle | 42 | 30 | | S | Y | beneficiar.a2 | _ | _ | _ | _ | +
-| 34 | dels | del | dels | s | s | postype=preposition<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p<nowiki>|</nowiki>contracted=yes postype=preposition<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p<nowiki>|</nowiki>contracted=yes | 33 | 33 | creg | creg | _ | _ | _ | arg1-null | _ | _ | +
-| 35 | càrrecs | càrrec | càrrec | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 34 | 34 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 36 | públics | públic | públic | a | a | postype=qualificative<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=qualificative<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 35 | 35 | s.a | s.a | _ | _ | _ | _ | _ | _ | +
-| 37 | que | que | que | p | p | postype=relative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=relative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 39 | 39 | cd | cd | _ | _ | _ | _ | arg1-pat | _ | +
-| 38 | _ | _ | _ | p | p | _ | _ | 39 | 39 | suj | suj | _ | _ | _ | _ | arg0-agt | _ | +
-| 39 | desenvolupaven | desenvolupar | desenvolupar | v | v | postype=main<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=imperfect | postype=main<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=imperfect | 35 | 35 | S | S | Y | desenvolupar.a2 | _ | _ | _ | _ | +
-| 40 | i | i | i | c | c | postype=coordinating | postype=coordinating | 42 | 33 | coord | coord | _ | _ | _ | _ | _ | _ | +
-| 41 | la_seva | el_seu | el_seu | d | d | postype=possessive<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3 | postype=possessive<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3 | 42 | 42 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 42 | relació | relació | relació | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 30 | 33 | sn | cd | _ | _ | _ | _ | _ | _ | +
-| 43 | amb | amb | amb | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 42 | 42 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 44 | les | el | el | d | d | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | 45 | 45 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 45 | empreses | empresa | empresa | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | 43 | 43 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 46 | càrniques | càrnic | càrnic | a | a | postype=qualificative<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | postype=qualificative<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | 45 | 45 | s.a | s.a | _ | _ | _ | _ | _ | _ | +
-| 47 | de | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 45 | 45 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 48 | la | el | el | d | d | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 49 | 49 | spec | spec | _ | _ | _ | _ | _ | _ | +
-| 49 | zona | zona | zona | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | 47 | 47 | sn | sn | _ | _ | _ | _ | _ | _ | +
-| 50 | en | en | en | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 42 | 42 | sp | sp | _ | _ | _ | _ | _ | _ | +
-| 51 | oferir | oferir | oferir | v | v | postype=main<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c<nowiki>|</nowiki>mood=infinitive | postype=main<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c<nowiki>|</nowiki>mood=infinitive | 50 | 50 | S | S | Y | oferir.a32 | _ | _ | _ | _ | +
-| 52 | -los | ell | ell | p | p | postype=personal<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p<nowiki>|</nowiki>person=3 | postype=personal<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p<nowiki>|</nowiki>person=3 | 51 | 51 | ci | ci | _ | _ | _ | _ | _ | arg2-ben | +
-| 53 | serveis | servei | servei | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 51 | 51 | cd | cd | _ | _ | _ | _ | _ | arg1-pat | +
-| 54 | particulars | particular | particular | a | a | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | 53 | 53 | s.a | s.a | _ | _ | _ | _ | _ | _ | +
-| 55 | . | . | . | f | f | punct=period | punct=period | 7 | 7 | f | f | _ | _ | _ | _ | _ | _ |+
  
 The first sentence of the CoNLL 2009 development data:
  
-| 1 | Fundació_Privada_Fira_de_Manresa | Fundació_Privada_Fira_de_Manresa | Fundació_Privada_Fira_de_Manresa | n | n | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 3 | 3 | suj | suj | _ | _ | arg0-agt | +| 1 | <nowiki>|</nowiki> | <nowiki>|</nowiki> | <nowiki>|</nowiki>SubPOS=SubPOS=| 3 | ExD AuxG | _ | _ | _ | 
-| 2 | ha | haver | haver | v | v | postype=auxiliary<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=present postype=auxiliary<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=present 3 | 3 | v | v | _ | _ | _ | +Daňový daňový daňový SubPOS=A<nowiki>|</nowiki>Gen=M<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=SubPOS=A<nowiki>|</nowiki>Gen=M<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Gra=1<nowiki>|</nowiki>Neg=| 3 | 3 | Atr Atr daňový | _ | RSTR 
-fet fer fer postype=main<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>mood=pastparticiple | postype=main<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>mood=pastparticiple | 0 | 0 | sentence | sentence | Y | fer.a2 | _ | +poradce poradce poradce SubPOS=N<nowiki>|</nowiki>Gen=M<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=SubPOS=N<nowiki>|</nowiki>Gen=M<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=ExD ExD poradce | _ | _ | 
-| 4 | un | un | un | d | d postype=numeral<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=numeral<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | 5 | 5 | spec | spec | _ | _ | _ | +| <nowiki>|</nowiki> | <nowiki>|</nowiki> | <nowiki>|</nowiki>SubPOS=SubPOS=| 3 | AuxK AuxG | _ | _ | _ | _ |
-| 5 | balanç | balanç | balanç | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=| 3 | 3 | cd cd _ | arg1-pat | +
-| 6 | de | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 5 | 5 | sp | sp | _ | _ | _ +
-l' el el postype=article<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | 8 | 8 | spec | spec | _ | _ | _ | +
-| 8 | activitat | activitat | activitat | n | n postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=sn sn _ | _ | +
-| 9 | del | del | del | s | s | postype=preposition<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>contracted=yes | postype=preposition<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s<nowiki>|</nowiki>contracted=yes | 8 | 8 | sp | sp | _ | _ | _ | +
-10 Palau_Firal | Palau_Firal | Palau_Firal | n | n | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 9 | 9 | sn | sn | _ | _ | _ | +
-| 11 | durant | durant | durant | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=| 3 | sp cc | _ | _ | _ | +
-| 12 | els | el | el | d | d | postype=article<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=article<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 15 | 15 | spec | spec | _ | _ | _ | +
-| 13 | primers | primer | primer | a | a | postype=ordinal<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=ordinal<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 12 | 12 | a | a | _ | _ | _ | +
-| 14 | cinc | cinc | cinc | d | d | postype=numeral<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | postype=numeral<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=p | 12 | 12 | d | d | _ | _ | _ | +
-| 15 | mesos | mes | mes | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=p | 11 | 11 | sn | sn | _ | _ | _ | +
-| 16 | de | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | 15 | 15 | sp | sp | _ | _ | _ | +
-| 17 | l' | el | el | d | d | postype=article<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | 18 | 18 | spec | spec | _ | _ | _ | +
-| 18 | any | any | any | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | 16 | 16 | sn | sn | _ | _ | _ | +
-| 19 | . | . | . | f | f | punct=period | punct=period | 3 | 3 | f | f | _ | _ | _ |+
  
 The first sentence of the CoNLL 2009 test data:
  
-| 1 | El el el postype=article<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +| 1 | Názor názor názor SubPOS=N<nowiki>|</nowiki>Gen=I<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=SubPOS=N<nowiki>|</nowiki>Gen=I<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=1<nowiki>|</nowiki>Neg=| _ | _ | _ | _ | 
-| 2 | darrer | darrer | darrer | a | a postype=ordinal<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=ordinal<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=| _ | _ | _ | _ | +experta expert expert SubPOS=N<nowiki>|</nowiki>Gen=M<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=2<nowiki>|</nowiki>Neg=SubPOS=N<nowiki>|</nowiki>Gen=M<nowiki>|</nowiki>Num=S<nowiki>|</nowiki>Cas=2<nowiki>|</nowiki>Neg=| _ | _ | _ | _ | Y |
-número número número postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 4 | de | de | de | s | s postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | _ | _ | _ | _ | _ | +
-| 5 | l' | el | el | d | d | postype=article<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 6 | Observatori_del_Mercat_de_Treball_d'_Osona | Observatori_del_Mercat_de_Treball_d'_Osona | Observatori_del_Mercat_de_Treball_d'_Osona | n | n | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | _ | _ | _ | _ | _ | +
-| 7 | inclou | incloure | incloure | v | v | postype=main<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=present | postype=main<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s<nowiki>|</nowiki>person=3<nowiki>|</nowiki>mood=indicative<nowiki>|</nowiki>tense=present | _ | _ | _ | _ | Y | +
-| 8 | un | un | un | d | d | postype=numeral<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=numeral<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 9 | informe | informe | informe | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 10 | especial | especial | especial | a | a | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 11 | sobre | sobre | sobre | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | _ | _ | _ | _ | _ | +
-| 12 | la | el | el | d | d | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 13 | contractació | contractació | contractació | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 14 | a_través_de | a_través_de | a_través_de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | _ | _ | _ | _ | _ | +
-| 15 | les | el | el | d | d | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | _ | _ | _ | _ | _ | +
-| 16 | empreses | empresa | empresa | n | n | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | postype=common<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | _ | _ | _ | _ | _ | +
-| 17 | de | de | de | s | s | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=preposition<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | _ | _ | _ | _ | _ | +
-| 18 | treball | treball | treball | n | n | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | postype=common<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 19 | temporal | temporal | temporal | a | a | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | postype=qualificative<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=s | _ | _ | _ | _ | _ | +
-| 20 | , | , | , | f | f | punct=comma | punct=comma | _ | _ | _ | _ | _ | +
-| 21 | les | el | el | d | d | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | postype=article<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>num=p | _ | _ | _ | _ | _ | +
-| 22 | ETT | ETT | ETT | n | n | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | postype=proper<nowiki>|</nowiki>gen=c<nowiki>|</nowiki>num=c | _ | _ | _ | _ | _ | +
-| 23 | . | . | . | f | f | punct=period | punct=period | _ | _ | _ | _ | _ |+
  
 ==== Parsing ====
  
-Nonprojectivities in AnCora-CA are very rare. Only 487 of the 435,860 tokens in the CoNLL 2007 version are attached nonprojectively (0.11%). In the CoNLL 2009 version, there are no nonprojectivities at all. +PDT is a mildly nonprojective treebank. 8,351 of the 437,020 tokens in the CoNLL 2007 version are attached nonprojectively (1.91%).
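
The nonprojectivity percentages above can be reproduced with a short script. The following is a minimal sketch, not the procedure actually used for this page: it assumes the common definition that a token is attached nonprojectively when some token lying between it and its head is not dominated by that head. The function names and the four-token example tree are hypothetical.

```python
def nonprojective_tokens(heads):
    """Return 1-based ids of tokens whose arc to their head is nonprojective.

    heads[i] is the 1-based head id of token i+1; 0 marks the artificial root.
    Assumption: an arc is projective if every token strictly between the
    dependent and its head is a descendant of the head.
    """
    n = len(heads)

    def dominated_by(ancestor, node):
        # The artificial root dominates every token.
        if ancestor == 0:
            return True
        # Walk up the tree; the loop bound guards against malformed cycles.
        for _ in range(n):
            node = heads[node - 1]
            if node == ancestor:
                return True
            if node == 0:
                return False
        return False

    result = []
    for dep in range(1, n + 1):
        head = heads[dep - 1]
        left, right = min(head, dep), max(head, dep)
        if any(not dominated_by(head, k) for k in range(left + 1, right)):
            result.append(dep)
    return result


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Hypothetical tree: token 2 depends on token 4, but token 3 in between
# hangs on the root, so the arc 4 -> 2 is nonprojective.
print(nonprojective_tokens([3, 4, 0, 3]))
# Token counts quoted in the text above.
print(percent(8351, 437020), percent(487, 435860))
```

Summing `len(nonprojective_tokens(...))` over all sentences and dividing by the total token count would give a figure comparable to the ones quoted, provided the same definition of nonprojectivity was used.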
 + 
 +There is an [[http://ufal.mff.cuni.cz/czech-parsing/|online summary]] of known results in Czech parsing. 
 + 
 +The results of the CoNLL 2006 shared task are [[http://ilk.uvt.nl/conll/results.html|available online]]. They have been published in [[http://aclweb.org/anthology-new/W/W06/W06-2920.pdf|(Buchholz and Marsi, 2006)]]. The evaluation procedure was non-standard because it excluded punctuation tokens. These are the best results for Czech:
 + 
 +^ Parser (Authors) ^ LAS ^ UAS ^ 
 +| MST (McDonald et al.) | 80.18 | 87.30 | 
 +| Basis (O'Neil) | 76.60 | 85.58 | 
 +| Malt (Nivre et al.) | 78.42 | 84.80 | 
 +| Nara (Yuchang Cheng) | 76.24 | 83.40 |
  
-The results of the CoNLL 2007 shared task are [[http://nextens.uvt.nl/depparse-wiki/AllScores|available online]]. They have been published in [[http://aclweb.org/anthology-new/D/D07/D07-1096.pdf|(Nivre et al., 2007)]]. The evaluation procedure was changed to include punctuation tokens. These are the best results for Catalan:+The results of the CoNLL 2007 shared task are [[http://nextens.uvt.nl/depparse-wiki/AllScores|available online]]. They have been published in [[http://aclweb.org/anthology-new/D/D07/D07-1096.pdf|(Nivre et al., 2007)]]. The evaluation procedure was changed to include punctuation tokens. These are the best results for Czech:
  
 ^ Parser (Authors) ^ LAS ^ UAS ^
-| Titov et al. | 87.40 | 93.40 | +| Nakagawa | 80.19 | 86.28 |
-| Sagae | 88.16 | 93.34 | +| Carreras | 78.60 | 85.16 |
-| Malt (Nilsson et al.) | 88.70 | 93.12 | +| Titov et al. | 77.94 | 84.19 |
-| Nakagawa | 87.90 | 92.86 | +| Malt (Nilsson et al.) | 77.98 | 83.59 |
-| Carreras | 87.60 | 92.46 | +| Attardi et al. | 77.37 | 83.40 |
-| Malt (Hall et al.) | 87.74 | 92.20 | +| Malt (Hall et al.) | 77.22 | 82.35 |
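
The LAS and UAS columns, and the effect of excluding punctuation (2006) versus including it (2007), can be illustrated as follows. This is a sketch under stated assumptions rather than the official evaluation script: it assumes a token counts as punctuation when every character of its word form belongs to a Unicode punctuation category, and the tuple layout, function names and three-token example are invented for illustration.

```python
import unicodedata


def is_punctuation(form):
    """Assumed test: every character is in a Unicode punctuation category (P*)."""
    return bool(form) and all(
        unicodedata.category(ch).startswith("P") for ch in form
    )


def attachment_scores(tokens, include_punct=True):
    """Return (LAS, UAS) in percent.

    Each token is a tuple (form, gold_head, gold_rel, pred_head, pred_rel).
    UAS counts tokens with the correct head; LAS additionally requires the
    correct dependency label.
    """
    scored = [t for t in tokens if include_punct or not is_punctuation(t[0])]
    if not scored:
        return (0.0, 0.0)
    uas_hits = sum(1 for _, gh, _, ph, _ in scored if gh == ph)
    las_hits = sum(
        1 for _, gh, gr, ph, pr in scored if gh == ph and gr == pr
    )
    return (100.0 * las_hits / len(scored), 100.0 * uas_hits / len(scored))


# Invented example: one wrong label on the first word, one wrong head on the period.
tokens = [
    ("Názor", 2, "Sb", 2, "Obj"),
    ("platí", 0, "Pred", 0, "Pred"),
    (".", 0, "AuxK", 2, "AuxK"),
]
print(attachment_scores(tokens, include_punct=False))  # 2006-style scoring
print(attachment_scores(tokens, include_punct=True))   # 2007-style scoring
```

Because the two shared tasks scored different token sets, the 2006 and 2007 figures in the tables are not directly comparable.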
  
 The two Malt parser results of 2007 (single malt and blended) are described in [[http://aclweb.org/anthology-new/D/D07/D07-1097.pdf|(Hall et al., 2007)]] and the details about the parser configuration are described [[http://w3.msi.vxu.se/users/jha/conll07/|here]].
  
-The results of the CoNLL 2009 shared task are [[http://ufal.mff.cuni.cz/conll2009-st/results/results.php|available online]]. They have been published in [[http://aclweb.org/anthology/W/W09/W09-1201.pdf|(Hajič et al., 2009)]]. Unlabeled attachment score was not published. These are the best results for Catalan:+The results of the CoNLL 2009 shared task are [[http://ufal.mff.cuni.cz/conll2009-st/results/results.php|available online]]. They have been published in [[http://aclweb.org/anthology/W/W09/W09-1201.pdf|(Hajič et al., 2009)]]. Unlabeled attachment score was not published. These are the best results for Czech:
  
 ^ Parser (Authors) ^ LAS ^
-| Merlo | 87.86 | +| Merlo (Gesmundo et al.) | 80.38 |
-| Che | 86.56 | +| Bohnet | 80.11 |
-| Bohnet | 86.35 | +| Che et al. | 80.01 |
-| Chen | 85.88 | +
  
