
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks:eu [2011/11/29 10:25]
zeman Inside.
user:zeman:treebanks:eu [2011/11/29 11:14]
zeman Parsing results.
Line 5: Line 5:
 ==== Versions ==== ==== Versions ====
  
-  * CoNLL 2007 +  * CoNLL 2007 (BDT-I)
   * BDT-II (obtained per e-mail in 2011)   * BDT-II (obtained per e-mail in 2011)
  
Line 43: Line 43:
  
 ==== Inside ==== ==== Inside ====
 +
 +Both versions (CoNLL 2007 and BDT-II) are in the CoNLL 2006/2007 format.
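
As a rough illustration of that format: a CoNLL 2006/2007 file holds one token per line with ten tab-separated columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL), and the FEATS items are joined by a vertical bar. The Python sketch below reads one such line. The `Token` type and the `parse_conll_line` helper are illustrative names, not part of any treebank tooling, and the sample row is token 5 of the BDT-II training sentence listed in the Sample section.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One token line; PHEAD and PDEPREL are ignored in this sketch."""
    id: int
    form: str
    lemma: str
    cpostag: str
    postag: str
    feats: List[str]
    head: int
    deprel: str


def parse_conll_line(line: str) -> Token:
    """Parse a single tab-separated CoNLL 2006/2007 token line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    # "_" marks an empty FEATS column; otherwise items are separated by "|".
    feats = [] if cols[5] == "_" else cols[5].split("|")
    return Token(int(cols[0]), cols[1], cols[2], cols[3], cols[4],
                 feats, int(cols[6]), cols[7])


# Token 5 of the BDT-II training sample, written with real tabs.
sample = "5\tdu\t*edun\tADL\tADL\tMDN:A1|NOR:HURA|NORK:HARK\t4\tauxmod\t_\t_"
tok = parse_conll_line(sample)
print(tok.form, tok.head, tok.deprel, tok.feats)
```

In the wiki tables the bar inside FEATS is wrapped in `<nowiki>` only to keep it from being read as a table delimiter; in the data files it is a plain `|`.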
  
 Part of speech tag description (obtained per e-mail from Koldo Gojenola, thanks!): Part of speech tag description (obtained per e-mail from Koldo Gojenola, thanks!):
Line 99: Line 101:
  
 The syntactic guidelines (structure and labels) are described in Spanish in this [[http://ixa.si.ehu.es/Ixa/Argitalpenak/Barne_txostenak/1068549887/publikoak/guia.pdf|technical report]]. See Appendix 3 for some lists of tags. The syntactic guidelines (structure and labels) are described in Spanish in this [[http://ixa.si.ehu.es/Ixa/Argitalpenak/Barne_txostenak/1068549887/publikoak/guia.pdf|technical report]]. See Appendix 3 for some lists of tags.
 +
 +Multi-word expressions have been collapsed into one token, using underscore as the joining character (e.g. Espainia_Poliziak, iduri_zait).
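
If the individual words of such a token are needed, a naive split on the underscore recovers them. This is only a sketch: `split_mwe` is a hypothetical helper, and it splits the surface form only, since the lemma and morphological features of the collapsed token describe the expression as a whole.

```python
from typing import List


def split_mwe(form: str) -> List[str]:
    """Split a collapsed multi-word token on the joining underscore."""
    # A bare "_" is the CoNLL placeholder for an empty value, not a joiner.
    if form == "_":
        return [form]
    return [part for part in form.split("_") if part]


parts = split_mwe("Espainia_Poliziak")
print(parts)
print(split_mwe("iduri_zait"))
print(split_mwe("adi-adi"))  # hyphenated forms are left untouched
```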
  
 ==== Sample ==== ==== Sample ====
Line 104: Line 108:
 The first sentence of the CoNLL 2007 training data: The first sentence of the CoNLL 2007 training data:
  
-| 1 |  |  | PUNCT | PUNCT | _ | 10 | AuxG | _ | _ | +| 1 | espainiako_poliziak | Espainia_Poliziak | IZE | IZE_LIB | PLU-<nowiki>|</nowiki>ENTI_LOC |  | ncsubj | _ | _ |
-| 2 | Τα | ο | At | AtDf | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Nm | 3 | Atr | _ | _ | +| 2 | hiru | hiru | DET | DET_DZH | NMGP |  | detmod | _ | _ |
-| 3 | αντισώματα | αντίσωμα | No | NoCm | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Nm | 5 | Sb | _ | _ | +| 3 | gazte | gazte | IZE | IZE_ARR | ABS<nowiki>|</nowiki>MG |  | ncobj | _ | _ |
-| 4 | IgG | IgG | Rg | RgFwOr | _ | 3 | Atr | _ | _ | +| 4 | atxilotu | atxilotu | ADI | ADI_SIN | PART<nowiki>|</nowiki>BURU | 8 | lot | _ | _ |
-| 5 | είναι | είμαι | Vb | VbMn | Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx | 10 | Obj_Co | _ | _ | +| 5 | ditu | *edun | ADL | ADL | A1<nowiki>|</nowiki>NR_HAIEK<nowiki>|</nowiki>NK_HARK |  | auxmod | _ | _ |
-| 6 | σαν | σαν | Ad | Ad | Ba | 5 | Adv | _ | _ | +| 6 | atarrabian | Atarrabia | IZE | IZE_LIB | PLU-<nowiki>|</nowiki>INE<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC |  | ncmod | _ | _ |
-| 7 | μακροπρόθεσμη | μακροπρόθεσμος | Aj | Aj | Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 8 | Atr | _ | _ | +| 7 |  |  | PUNC | PUNC_KOMA | _ |  | PUNC | _ | _ |
-| 8 | μνήμη | μνήμη | No | NoCm | Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm |  | Adv | _ | _ | +| 8 | eta | eta | LOT | LOT_JNT |  |  | ROOT | _ | _ |
-| 9 | , | , | PUNCT | PUNCT | _ | 10 | AuxX | _ | _ | +| 9 | madrilera | Madril | IZE | IZE_LIB | PLU-<nowiki>|</nowiki>ALA<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC | 10 | ncmod | _ | _ |
-| 10 | ενώ | ενώ | Cj | CjCo | _ | 26 | Coord | _ | _ | +| 10 | eraman | eraman | ADI | ADI_SIN | PART<nowiki>|</nowiki>BURU |  | lot | _ | _ |
-| 11 | το | ο | At | AtDf | Ne<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 12 | Atr | _ | _ | +| 11 | ditu | *edun | ADL | ADL | A1<nowiki>|</nowiki>NR_HAIEK<nowiki>|</nowiki>NK_HARK | 10 | auxmod | _ | _ |
-| 12 | IgA | IgA | Rg | RgFwOr | _ | 15 | Sb | _ | _ | +| 12 | . | . | PUNC | PUNC_PUNC | _ | 11 | PUNC | _ | _ |
-| 13 | πιστεύεται | πιστεύεται | Vb | VbMn | Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx | 10 | Obj_Co | _ | _ | +
-| 14 | ότι | ότι | Cj | CjSb | _ | 13 | AuxC | _ | _ | +
-| 15 | είναι | είμαι | Vb | VbMn | Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx | 14 | Sb | _ | _ | +
-| 16 | ένας | ένας | At | AtId | Ma<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 18 | Atr | _ | _ | +
-| 17 | συγκεκριμένος | συγκεκριμένος | Aj | Aj | Ba<nowiki>|</nowiki>Ma<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 18 | Atr | _ | _ | +
-| 18 | δείκτης | δείκτης | No | NoCm | Ma<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 15 | Pnom | _ | _ | +
-| 19 | για | για | AsPp | AsPpSp | _ | 18 | AuxP | _ | _ | +
-| 20 | πρόσφατες | πρόσφατος | Aj | Aj | Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 21 | Atr_Co | _ | _ | +
-| 21 | ή | ή | Cj | CjCo | _ | 23 | Coord | _ | _ | +
-| 22 | χρόνιες | χρόνιος | Aj | Aj | Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 21 | Atr_Co | _ | _ | +
-| 23 | λοιμώξεις | λοίμωξη | No | NoCm | Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 19 | Atr | _ | _ | +
-| 24 | " | " | PUNCT | PUNCT | _ | 10 | AuxG | _ | _ | +
-| 25 | , | , | PUNCT | PUNCT | _ | 10 | AuxX | _ | _ | +
-| 26 | εξηγεί | εξηγώ | Vb | VbMn | Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Av<nowiki>|</nowiki>Xx | 0 | Pred | _ | _ | +
-| 27 | η | ο | At | AtDf | Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 28 | Atr | _ | _ | +
-| 28 | Δρ | Δρ | Rg | RgFwTr | _ | 26 | Sb | _ | _ | +
-| 29 | Αρκάρι | Αρκάρι | No | NoCm | Ne<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 28 | Atr | _ | _ | +
-| 30 | . | . | PUNCT | PUNCT | _ |  | AuxK | _ | _ | +
  
 The first sentence of the CoNLL 2007 test data: The first sentence of the CoNLL 2007 test data:
  
-| 1 | Η | ο | At | AtDf | Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm | 2 | Atr | _ | _ | +| 1 | epaileek | epaile | IZE | IZE_ARR | BIZ+<nowiki>|</nowiki>ERG<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM |
-| 2 | Σίφνος | Σίφνος | No | NoPr | Fe<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Nm |  | Sb | _ | _ | +| 2 | diote | esan | ADT | ADT | PNT<nowiki>|</nowiki>A1<nowiki>|</nowiki>NR_HURA<nowiki>|</nowiki>NK_HAIEK-K |
-| 3 | φημίζεται | φημίζομαι | Vb | VbMn | Id<nowiki>|</nowiki>Pr<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Xx<nowiki>|</nowiki>Ip<nowiki>|</nowiki>Pv<nowiki>|</nowiki>Xx |  | Pred | _ | _ | +| 3 | eaeko | EAE | IZE | IZE_LIB | SIG<nowiki>|</nowiki>GEL<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC |
-| 4 | και | και | Cj | CjCo | _ |  | AuxY | _ | _ | +| 4 | parlamentarioek | parlamentario | ADJ | ADJ_ARR | IZAUR-<nowiki>|</nowiki>ERG<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM |
-| 5 | για | για | AsPp | AsPpSp | _ | 3 | AuxP | _ | _ | +| 5 | eaetik_kanpo | EAE | SIG | SIG- | DEK<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>DEK<nowiki>|</nowiki>ABL_kanpo_ABS<nowiki>|</nowiki>ENTI_LOC<nowiki>|</nowiki>POS |
-| 6 | τα | ο | At | AtDf | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac |  | Atr | _ | _ | +| 6 | eginiko | egin | ADI | ADI_SIN | PART<nowiki>|</nowiki>GEL |
-| 7 | καταγάλανα | καταγάλανος | Aj | Aj | Ba<nowiki>|</nowiki>Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 8 | Atr | _ | _ | +| 7 | delituak | delitu | IZE | IZE_ARR | BIZ-<nowiki>|</nowiki>ABS<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM |
-| 8 | νερά | νερό | No | NoCm | Ne<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ac | 5 | Obj | _ | _ | +| 8 | ikertzea | ikertu | ADI | ADI_SIN | ADIZE<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>ABS |
-| 9 | των | ο | At | AtDf | Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ge | 11 | Atr | _ | _ | +| 9 | eta | eta | LOT | LOT_JNT | - |
-| 10 | πανέμορφων | πανέμορφος | Aj | Aj | Ba<nowiki>|</nowiki>Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ge | 11 | Atr | _ | _ | +| 10 | epaitzea | epaitu | ADI | ADI_SIN | ADIZE<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>ABS |
-| 11 | ακτών | ακτή | No | NoCm | Fe<nowiki>|</nowiki>Pl<nowiki>|</nowiki>Ge |  | Atr | _ | _ | +| 11 | auzitegi_gorenari | auzitegi_gora | ADJ | ADJ_IZO | DEK<nowiki>|</nowiki>GEN<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>DEK<nowiki>|</nowiki>DAT<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC |
-| 12 | της | μου | Pn | PnPo | Fe<nowiki>|</nowiki>03<nowiki>|</nowiki>Sg<nowiki>|</nowiki>Ge<nowiki>|</nowiki>Xx | 11 | Atr | _ | _ | +| 12 | dagokiola | egon | ADT | ADT | PNT<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>A1<nowiki>|</nowiki>NR_HURA<nowiki>|</nowiki>NI_HARI |
-| 13 | . | . | PUNCT | PUNCT | _ |  | AuxK | _ | _ | +| 13 | , | , | PUNC | PUNC_KOMA | _ |
 +| 14 | baina | baina | LOT | LOT_JNT | AURK | 
 +| 15 | atzerrian | atzerri | IZE | IZE_ARR | INE<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM | 
 +| 16 | izaniko | izan | ADI | ADI_SIN | PART<nowiki>|</nowiki>GEL | 
 +| 17 | kontaktu | kontaktu | IZE | IZE_ARR | _ | 
 +| 18 | horiek | horiek | DET | DET_ERKARR | ABS<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM |
 +| 19 | ezin_direla | ezin_izan | ADI | ADI_ADK | PNT<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>A1<nowiki>|</nowiki>NR_HAIEK<nowiki>|</nowiki>MWCorrect | 
 +| 20 | delitutzat | delitu | IZE | IZE_ARR | BIZ-<nowiki>|</nowiki>PRO<nowiki>|</nowiki>MG | 
 +| 21 | hartu | hartu | ADI | ADI_SIN | PART | 
 +| 22 | . | . | PUNC | PUNC_PUNC | _ | 
 + 
 +The first sentence of the BDT-II training data: 
 + 
 +| 1 | Estatu_Batuetako_DEAko | Estatu_Batuak_DEA | IZE | LIB | PLU:+<nowiki>|</nowiki>IZAUR:-<nowiki>|</nowiki>KAS:GEL<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>MW:B<nowiki>|</nowiki>ENT:Erakundea | 2 | ncmod | _ | _ | 
 +| 2 | buru | buru | IZE | ARR | _ | 4 | ncsubj | _ | _ |
 +| 3 | ohiak | ohi | ADJ | ARR | IZAUR:-<nowiki>|</nowiki>KAS:ERG<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ |
 +| 4 | aztertuko | aztertu | ADI | SIN | ADM:PART<nowiki>|</nowiki>ASP:GERO | 0 | ROOT | _ | _ | 
 +| 5 | du | *edun | ADL | ADL | MDN:A1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORK:HARK | 4 | auxmod | _ | _ | 
 +| 6 | RUCen | RUC | IZE | IZB | MTKAT:SIG<nowiki>|</nowiki>KAS:GEN<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Erakundea | 7 | ncmod | _ | _ |
 +| 7 | erreforma | erreforma | IZE | ARR | KAS:ABS<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG: | 4 | ncobj | _ | _ |
 +| 8 | . | . | PUNT_MARKA | PUNT_PUNT | _ |  | PUNC | _ | _ |
 + 
 +The first sentence of the BDT-II development data: 
 + 
 +| 1 | Irakaskuntzan | irakaskuntza | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ |
 +| 2 | jardun | jardun | ADI | SIN | ADM:PART<nowiki>|</nowiki>ASP:BURU | 0 | ROOT | _ | _ | 
 +| 3 | zuen | *edun | ADL | ADL | MDN:B1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORK:HARK | 2 | auxmod | _ | _ | 
 +| 4 | Miel | Miel | IZE | IZB | PLU:-<nowiki>|</nowiki>ENT:Pertsona | 5 | entios | _ | _ |
 +| 5 | Anjel_Elustondok | Anjel_Elustondo | IZE | IZB | PLU:-<nowiki>|</nowiki>KAS:ERG<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Pertsona | 2 | ncsubj | _ | _ | 
 +| 6 | 1980 | 1980 | IZE | ZKI | _ | 7 | ncmod | _ | _ |
 +| 7 | urtetik | urte | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:ABL<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ |
 +| 8 | 1992ra | 1992 | IZE | ZKI | KAS:ALA<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ |
 +| 9 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 8 | PUNC | _ | _ |
 +| 10 | hauetatik | hauek | DET | ERKARR | KAS:ABL<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ |
 +| 11 | hamar | hamar | DET | DZH | NMG:P | 12 | detmod | _ | _ | 
 +| 12 | urtez | urte | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INS<nowiki>|</nowiki>MUG:MG | 16 | lot | _ | _ | 
 +| 13 | Azpeitiko | Azpeitia | IZE | LIB | PLU:-<nowiki>|</nowiki>KAS:GEL<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Tokia | 14 | ncmod | _ | _ | 
 +| 14 | ikastolan | ikastola | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ | 
 +| 15 | irakasle | irakasle | IZE | ARR | KAS:ABS<nowiki>|</nowiki>MUG:MG | 16 | ncpred | _ | _ | 
 +| 16 | eta | eta | LOT | JNT | ERL:EMEN | 8 | aponcmod | _ | _ | 
 +| 17 | beste | beste | DET | DZG | _ | 18 | detmod | _ | _ | 
 +| 18 | biak | bi | IZE | ZKI | KAS:ABS<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M | 16 | lot | _ | _ | 
 +| 19 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 18 | PUNC | _ | _ | 
 +| 20 | Arabako | Araba | IZE | LIB | PLU:-<nowiki>|</nowiki>KAS:GEL<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Tokia | 21 | ncmod | _ | _ | 
 +| 21 | ikastolen | ikastola | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:GEN<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M | 22 | ncmod | _ | _ | 
 +| 22 | elkartean | elkarte | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ | 
 +| 23 | . | . | PUNT_MARKA | PUNT_PUNT | _ | 22 | PUNC | _ | _ | 
 + 
 +The first sentence of the BDT-II test data: 
 + 
 +| 1 | Hegoaldean | hegoalde | IZE | ARR | KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ | 
 +| 2 | iduri_zait | iduri_izan | ADI | ADK | ASP:PNT<nowiki>|</nowiki>MDN:A1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORI:NIRI<nowiki>|</nowiki>MW:B | 0 | ROOT | _ | _ | 
 +| 3 | euskararen | euskara | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:GEN<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 4 | ncmod | _ | _ | 
 +| 4 | mundu | mundu | IZE | ARR | BIZ:- | 7 | ncsubj | _ | _ | 
 +| 5 | hau | hau | DET | ERKARR | KAS:ABS<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 4 | detmod | _ | _ | 
 +| 6 | adi-adi | adi-adi | ADB | ARR | _ | 7 | ncmod | _ | _ | 
 +| 7 | dagola | egon | ADT | ADT | ASP:PNT<nowiki>|</nowiki>ERL:KONPL<nowiki>|</nowiki>MDN:A3<nowiki>|</nowiki>NOR:HURA | 2 | ccomp_obj | _ | _ | 
 +| 8 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 7 | PUNC | _ | _ | 
 +| 9 | Euskaltzaindiak | Euskaltzaindia | IZE | LIB | PLU:-<nowiki>|</nowiki>KAS:ERG<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Tokia | 11 | ncsubj | _ | _ |
 +| 10 | zer | zer | DET | NOLGAL | NMG:MG<nowiki>|</nowiki>KAS:ABS<nowiki>|</nowiki>MUG:MG | 11 | ncobj | _ | _ |
 +| 11 | erranen | erran | ADI | SIN | ADM:PART<nowiki>|</nowiki>ASP:GERO | 13 | menos | _ | _ |
 +| 12 | duen | *edun | ADL | ADL | ERL:ZHG<nowiki>|</nowiki>MDN:A1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORK:HARK | 11 | auxmod | _ | _ |
 +| 13 | zain | zain | ADB | ARR | _ | 7 | cmod | _ | _ | 
 +| 14 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 13 | PUNC | _ | _ | 
 +| 15 | haren | hura | DET | ERKARR | KAS:GEN<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ |
 +| 16 | arauen | arau | IZE | ARR | KAS:ABS<nowiki>|</nowiki>MUG:MG | 18 | ncmod | _ | _ |
 +| 17 | berehala | berehala | ADB | ARR | _ | 18 | ncmod | _ | _ |
 +| 18 | betetzeko | bete | ADI | SIN | ADM:ADIZE<nowiki>|</nowiki>ERL:HELB<nowiki>|</nowiki>KAS:ABS<nowiki>|</nowiki>MUG:MG |  | xmod | _ | _ |
 +| 19 | . | . | PUNT_MARKA | PUNT_PUNT | _ | 18 | PUNC | _ | _ |
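
In the BDT-II samples above, each FEATS item has the form NAME:VALUE (for example KAS:ERG), whereas the CoNLL 2007 samples use bare tags such as ERG. Assuming that pattern holds throughout the BDT-II files, a FEATS string can be turned into a dictionary as sketched below; `parse_feats` is a hypothetical helper name.

```python
from typing import Dict


def parse_feats(feats: str) -> Dict[str, str]:
    """Turn a BDT-II style FEATS string into a name -> value mapping."""
    if feats == "_":  # "_" marks an empty FEATS column
        return {}
    result: Dict[str, str] = {}
    for item in feats.split("|"):
        name, sep, value = item.partition(":")
        # Items without a colon are kept under their own name with an empty value.
        result[name] = value if sep else ""
    return result


feats = parse_feats("KAS:ERG|NUM:S|MUG:M")
print(feats)
```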
  
 ==== Parsing ==== ==== Parsing ====
  
-Nonprojectivities in GDT are not frequent. Only 823 of the 70223 tokens in the CoNLL 2007 version are attached nonprojectively (1.17%). +BDT is a mildly nonprojective treebank. 1925 of the 151,604 tokens of the combined BDT-II training and test sets are attached nonprojectively (1.27%).
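
The page does not say how these counts were obtained. A common definition, assumed here, treats a token as nonprojectively attached when some token lying between it and its head is not dominated by that head. The sketch below counts such tokens from a list of HEAD values; `dominated_by` and `count_nonprojective` are illustrative names, `toy` is an invented four-token tree, and `dev` holds the HEAD column of the BDT-II development sentence from the Sample section.

```python
from typing import List


def dominated_by(heads: List[int], node: int, ancestor: int) -> bool:
    """True if `ancestor` is reached by following heads upward from `node`."""
    steps = 0
    while node != 0 and steps < len(heads):  # step limit guards against cycles
        node = heads[node]
        if node == ancestor:
            return True
        steps += 1
    return False


def count_nonprojective(heads: List[int]) -> int:
    """Count tokens whose arc spans a token not dominated by the arc's head.

    heads[i] is the HEAD of token i; index 0 is a dummy for the artificial root.
    """
    count = 0
    for dep in range(1, len(heads)):
        head = heads[dep]
        lo, hi = min(dep, head), max(dep, head)
        if any(not dominated_by(heads, k, head) for k in range(lo + 1, hi)):
            count += 1
    return count


toy = [0, 3, 0, 2, 2]  # token 1 attaches to 3 across the root token 2
dev = [0, 2, 0, 2, 5, 2, 7, 2, 2, 8, 16, 12, 16,
       14, 16, 16, 8, 18, 16, 18, 21, 22, 16, 22]

toy_count = count_nonprojective(toy)
dev_count = count_nonprojective(dev)
share = round(100 * 1925 / 151604, 2)  # the BDT-II proportion quoted above
print(toy_count, dev_count, share)
```

Under this definition the development sentence is fully projective, and only the first arc of the toy tree counts as nonprojective.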
  
 The results of the CoNLL 2007 shared task are [[http://nextens.uvt.nl/depparse-wiki/AllScores|available online]]. They have been published in [[http://aclweb.org/anthology-new/D/D07/D07-1096.pdf|(Nivre et al., 2007)]]. The evaluation procedure was changed to include punctuation tokens. These are the best results for Greek: The results of the CoNLL 2007 shared task are [[http://nextens.uvt.nl/depparse-wiki/AllScores|available online]]. They have been published in [[http://aclweb.org/anthology-new/D/D07/D07-1096.pdf|(Nivre et al., 2007)]]. The evaluation procedure was changed to include punctuation tokens. These are the best results for Greek:
  
 ^ Parser (Authors) ^ LAS ^ UAS ^ ^ Parser (Authors) ^ LAS ^ UAS ^
-| Nakagawa | 76.31 | 84.08 | +| Malt (Nilsson et al.) | 76.94 | 82.84 |
-| Keith Hall et al. | 74.21 | 82.04 | +| Titov et al. | 75.49 | 81.93 |
-| Carreras | 73.56 | 81.37 | +| Sagae | 74.64 | 81.19 |
-| Malt (Nilsson et al.) | 74.65 | 81.22 | +| Carreras | 75.75 | 81.11 |
-| Titov et al. | 73.52 | 81.20 | +| Nakagawa | 72.56 | 81.04 |
-| Chen | 74.42 | 81.16 | +| Malt (J. Hall et al.) | 74.99 | 80.61 |
-| Duan | 74.29 | 80.77 | +| Johansson et al. | 75.08 | 80.43 |
-| Attardi et al. | 73.92 | 80.75 | +
-| Malt (J. Hall et al.) | 74.21 | 80.66 | +
  
 The two Malt parser results of 2007 (single malt and blended) are described in [[http://aclweb.org/anthology-new/D/D07/D07-1097.pdf|(Hall et al., 2007)]] and the details about the parser configuration are described [[http://w3.msi.vxu.se/users/jha/conll07/|here]]. The two Malt parser results of 2007 (single malt and blended) are described in [[http://aclweb.org/anthology-new/D/D07/D07-1097.pdf|(Hall et al., 2007)]] and the details about the parser configuration are described [[http://w3.msi.vxu.se/users/jha/conll07/|here]].
  
 +Parsing results on BDT-II have been published in Kepa Bengoetxea, Koldo Gojenola: [[http://aclweb.org/anthology-new/W/W10/W10-1404.pdf|Application of Different Techniques to Dependency Parsing of Basque]]. In: Proceedings of the First Workshop on Statistical Parsing of Morphologically Rich Languages (SPMRL 2010), NAACL Workshop, Los Angeles, California, USA, 2010. They report only Labeled Attachment Score (LAS) and their best system achieved LAS = 78.98%.
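
For reference, LAS is the percentage of tokens that receive both the correct head and the correct dependency label, and UAS is the percentage that receive the correct head regardless of label. The sketch below computes both from parallel lists of (HEAD, DEPREL) pairs. It is not the official evaluation script; `attachment_scores` is a hypothetical name and the four-token gold and predicted analyses are invented for illustration. Every token passed in is scored, so punctuation is included unless the caller filters it out.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (HEAD, DEPREL) of one token


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (LAS, UAS) in percent for two aligned token sequences."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head only
    las_hits = sum(g == p for g, p in zip(gold, pred))        # head and label
    n = len(gold)
    return 100.0 * las_hits / n, 100.0 * uas_hits / n


# Invented example: token 3 has the wrong label, token 4 the wrong head.
gold = [(2, "ncmod"), (0, "ROOT"), (2, "auxmod"), (2, "ncsubj")]
pred = [(2, "ncmod"), (0, "ROOT"), (2, "ncmod"), (3, "ncsubj")]
las, uas = attachment_scores(gold, pred)
print(las, uas)
```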
