user:zeman:treebanks:hu [2011/12/13 13:04] zeman Size. |
user:zeman:treebanks:hu [2011/12/13 13:32] zeman Inside. |
==== Inside ==== | ==== Inside ==== |
| |
Both versions (CoNLL 2007 and BDT-II) are in the CoNLL 2006/2007 format. | The original Szeged Treebank is a phrase-based treebank distributed in an XML-based, TEI-compliant format. The CoNLL 2007 version is dependency-based (i.e. the head of each phrase was identified) and is distributed in the CoNLL 2006/2007 format. |
| |
The syntactic guidelines (structure and labels) are described in Spanish in this [[http://ixa.si.ehu.es/Ixa/Argitalpenak/Barne_txostenak/1068549887/publikoak/guia.pdf|technical report]]. See Appendix 3 for some lists of tags. | Morphological annotation includes lemmas. Morphosyntactic tags were probably disambiguated manually. The tagset used in SzTB seems to be the same as, or similar to, [[http://nl.ijs.si/ME/V4/msd/html/msd-hu.html|Multext-East]]. In the CoNLL version, each tag was decomposed into the CPOS column, the POS column and a list of feature-value pairs in the FEATS column. |
| |
Multi-word expressions have been collapsed into one token, using underscore as the joining character (e.g. Espainia_Poliziak, iduri_zait). | |
| |
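The column layout described above can be illustrated with a minimal Python sketch. It assumes the usual tab-separated CoNLL 2006/2007 layout (ID, FORM, LEMMA, CPOS, POS, FEATS, HEAD, DEPREL, then two further columns) and the ''name=value'' feature style seen in the Hungarian sample; the names ''Token'', ''parse_feats'' and ''parse_line'' are hypothetical helpers, not part of any distributed tool.

```python
from typing import Dict, NamedTuple


class Token(NamedTuple):
    """One token line of a CoNLL 2006/2007 file (first eight columns only)."""
    idx: int
    form: str
    lemma: str
    cpos: str
    pos: str
    feats: Dict[str, str]
    head: int
    deprel: str


def parse_feats(field: str) -> Dict[str, str]:
    """Split a FEATS value such as 'n=singular|case=inessive' into a dict.

    An underscore means "no features". A feature without '=' (assumption:
    treated as an atomic flag) is stored with an empty string as its value.
    """
    if field == "_":
        return {}
    feats = {}
    for pair in field.split("|"):
        name, _, value = pair.partition("=")
        feats[name] = value
    return feats


def parse_line(line: str) -> Token:
    """Parse one tab-separated token line into a Token."""
    cols = line.rstrip("\n").split("\t")
    return Token(
        idx=int(cols[0]),
        form=cols[1],
        lemma=cols[2],
        cpos=cols[3],
        pos=cols[4],
        feats=parse_feats(cols[5]),
        head=int(cols[6]),   # 0 marks the root of the sentence
        deprel=cols[7],
    )


# Token 4 of the first training sentence, typed in by hand with tabs.
sample = "4\thónapban\thónap\tN\tNc\tn=singular|case=inessive|proper=no\t16\tINE\t_\t_"
token = parse_line(sample)
print(token.cpos, token.pos, token.feats, token.head, token.deprel)
```

Keeping FEATS as a dictionary makes it easy to look up a single category (for example ''token.feats.get("case")'') without re-splitting the string each time.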
==== Sample ==== | ==== Sample ==== |
The first sentence of the CoNLL 2007 training data: | The first sentence of the CoNLL 2007 training data: |
| |
| 1 | espainiako_poliziak | Espainia_Poliziak | IZE | IZE_LIB | PLU-<nowiki>|</nowiki>ENTI_LOC | 4 | ncsubj | _ | _ | | | 1 | Az | az | T | Tf | <nowiki>def=yes</nowiki> | 4 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 2 | hiru | hiru | DET | DET_DZH | NMGP | 3 | detmod | _ | _ | | | 2 | elmúlt | elmúlt | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 4 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 3 | gazte | gazte | IZE | IZE_ARR | ABS<nowiki>|</nowiki>MG | 4 | ncobj | _ | _ | | | 3 | nyolc | nyolc | M | Mc | <nowiki>n=singular|case=nominative</nowiki> | 4 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 4 | atxilotu | atxilotu | ADI | ADI_SIN | PART<nowiki>|</nowiki>BURU | 8 | lot | _ | _ | | | 4 | hónapban | hónap | N | Nc | <nowiki>n=singular|case=inessive|proper=no</nowiki> | 16 | INE | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 5 | ditu | *edun | ADL | ADL | A1<nowiki>|</nowiki>NR_HAIEK<nowiki>|</nowiki>NK_HARK | 4 | auxmod | _ | _ | | | 5 | <nowiki>,</nowiki> | <nowiki>_</nowiki> | WPUNCT | WPUNCT | <nowiki>_</nowiki> | 16 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 6 | atarrabian | Atarrabia | IZE | IZE_LIB | PLU-<nowiki>|</nowiki>INE<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC | 4 | ncmod | _ | _ | | | 6 | amelyből | amely | P | Pr | <nowiki>p=3rd|n=singular|case=elative</nowiki> | 11 | ELA | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 7 | , | , | PUNC | PUNC_KOMA | _ | 6 | PUNC | _ | _ | | | 7 | összesen | összesen | R | Rx | <nowiki>_</nowiki> | 8 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 8 | eta | eta | LOT | LOT_JNT | - | 0 | ROOT | _ | _ | | | 8 | hatot | hat | M | Mc | <nowiki>n=singular|case=accusative</nowiki> | 11 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 9 | madrilera | Madril | IZE | IZE_LIB | PLU-<nowiki>|</nowiki>ALA<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC | 10 | ncmod | _ | _ | | | 9 | kényszerűségből | kényszerűség | N | Nc | <nowiki>n=singular|case=elative|proper=no</nowiki> | 11 | ELA | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 10 | eraman | eraman | ADI | ADI_SIN | PART<nowiki>|</nowiki>BURU | 8 | lot | _ | _ | | | 10 | szabadságon | szabadság | N | Nc | <nowiki>n=singular|case=superessive|proper=no</nowiki> | 11 | SUP | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 11 | ditu | *edun | ADL | ADL | A1<nowiki>|</nowiki>NR_HAIEK<nowiki>|</nowiki>NK_HARK | 10 | auxmod | _ | _ | | | 11 | töltött | tölt | V | Vm | <nowiki>mood=indicative|t=past|p=3rd|n=singular|def=no</nowiki> | 16 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 12 | . | . | PUNC | PUNC_PUNC | _ | 11 | PUNC | _ | _ | | | 12 | a | a | T | Tf | <nowiki>def=yes</nowiki> | 14 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 13 | parlamenti | parlamenti | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 14 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 14 | ellenzék | ellenzék | N | Nc | <nowiki>n=singular|case=nominative|proper=no</nowiki> | 11 | SUBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 15 | <nowiki>,</nowiki> | <nowiki>_</nowiki> | WPUNCT | WPUNCT | <nowiki>_</nowiki> | 16 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 16 | megváltozott | megváltozik | V | Vm | <nowiki>mood=indicative|t=past|p=3rd|n=singular|def=no</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 17 | itthon | itthon | R | Rx | <nowiki>_</nowiki> | 16 | LOCY | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 18 | a | a | T | Tf | <nowiki>def=yes</nowiki> | 19 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 19 | hatalommegosztás | hatalommegosztás | N | Nc | <nowiki>n=singular|case=nominative|proper=no</nowiki> | 22 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 20 | <nowiki>1990-ben</nowiki> | 1990 | M | Mc | <nowiki>n=singular|case=inessive</nowiki> | 21 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 21 | kialakított | kialakított | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 22 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 22 | rendszere | rendszer | N | Nc | <nowiki>n=singular|case=nominative|proper=no|pperson=3rd|pnumber=singular</nowiki> | 16 | SUBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 23 | <nowiki>:</nowiki> | <nowiki>_</nowiki> | WPUNCT | WPUNCT | <nowiki>_</nowiki> | 16 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 24 | az | az | T | Tf | <nowiki>def=yes</nowiki> | 26 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 25 | e | e | P | Pd | <nowiki>p=3rd|n=singular|case=nominative</nowiki> | 26 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 26 | héten | hét | N | Nc | <nowiki>n=singular|case=superessive|proper=no</nowiki> | 28 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 27 | audienciát | audiencia | N | Nc | <nowiki>n=singular|case=accusative|proper=no</nowiki> | 28 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 28 | tartó | tartó | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 29 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 29 | kormányfő | kormányfő | N | Nc | <nowiki>n=singular|case=nominative|proper=no</nowiki> | 31 | SUBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 30 | gyakorlatilag | gyakorlati | A | Af | <nowiki>deg=positive|n=singular|case=essive</nowiki> | 31 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 31 | kivonta | kivon | V | Vm | <nowiki>mood=indicative|t=past|p=3rd|n=singular|def=yes</nowiki> | 16 | CP | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 32 | magát | maga | P | Px | <nowiki>p=3rd|n=singular|case=accusative</nowiki> | 31 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 33 | az | az | T | Tf | <nowiki>def=yes</nowiki> | 34 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 34 | Országgyűlés | Országgyűlés | N | Np | <nowiki>n=singular|case=nominative|proper=yes</nowiki> | 35 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 35 | ellenőrzése | ellenőrzés | N | Nc | <nowiki>n=singular|case=nominative|proper=no|pperson=3rd|pnumber=singular</nowiki> | 36 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 36 | alól | alól | S | St | <nowiki>_</nowiki> | 31 | PP | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 37 | <nowiki>.</nowiki> | <nowiki>_</nowiki> | SPUNCT | SPUNCT | <nowiki>_</nowiki> | 16 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| |
The first sentence of the CoNLL 2007 test data: | The first sentence of the CoNLL 2007 test data: |
| |
| 1 | epaileek | epaile | IZE | IZE_ARR | BIZ+<nowiki>|</nowiki>ERG<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM | | | 1 | A | a | T | Tf | <nowiki>def=yes</nowiki> | 2 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 2 | diote | esan | ADT | ADT | PNT<nowiki>|</nowiki>A1<nowiki>|</nowiki>NR_HURA<nowiki>|</nowiki>NK_HAIEK-K | | | 2 | bankokkal | bank | N | Nc | <nowiki>n=plural|case=instrumental|proper=no</nowiki> | 4 | INS | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 3 | eaeko | EAE | IZE | IZE_LIB | SIG<nowiki>|</nowiki>GEL<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC | | | 3 | kell | kell | V | Vm | <nowiki>mood=indicative|t=present|p=3rd|n=singular|def=no</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 4 | parlamentarioek | parlamentario | ADJ | ADJ_ARR | IZAUR-<nowiki>|</nowiki>ERG<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM | | | 4 | egyezkedniük | egyezkedik | V | Vm | <nowiki>mood=infinitive|t=present|p=3rd|n=plural</nowiki> | 3 | INF | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 5 | eaetik_kanpo | EAE | SIG | SIG- | DEK<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>DEK<nowiki>|</nowiki>ABL_kanpo_ABS<nowiki>|</nowiki>ENTI_LOC<nowiki>|</nowiki>POS | | | 5 | azoknak | az | P | Pd | <nowiki>p=3rd|n=plural|case=dative</nowiki> | 8 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 6 | eginiko | egin | ADI | ADI_SIN | PART<nowiki>|</nowiki>GEL | | | 6 | a | a | T | Tf | <nowiki>def=yes</nowiki> | 8 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 7 | delituak | delitu | IZE | IZE_ARR | BIZ-<nowiki>|</nowiki>ABS<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM | | | 7 | mezőgazdasági | mezőgazdasági | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 8 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 8 | ikertzea | ikertu | ADI | ADI_SIN | ADIZE<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>ABS | | | 8 | termelőknek | termelő | N | Nc | <nowiki>n=plural|case=dative|proper=no</nowiki> | 4 | DAT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 9 | eta | eta | LOT | LOT_JNT | - | | | 9 | <nowiki>,</nowiki> | <nowiki>_</nowiki> | WPUNCT | WPUNCT | <nowiki>_</nowiki> | 3 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 10 | epaitzea | epaitu | ADI | ADI_SIN | ADIZE<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>ABS | | | 10 | akik | aki | P | Pr | <nowiki>p=3rd|n=plural|case=nominative</nowiki> | 21 | SUBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 11 | auzitegi_gorenari | auzitegi_gora | ADJ | ADJ_IZO | DEK<nowiki>|</nowiki>GEN<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>DEK<nowiki>|</nowiki>DAT<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM<nowiki>|</nowiki>ENTI_LOC | | | 11 | egy | egy | T | Ti | <nowiki>def=no</nowiki> | 19 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 12 | dagokiola | egon | ADT | ADT | PNT<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>A1<nowiki>|</nowiki>NR_HURA<nowiki>|</nowiki>NI_HARI | | | 12 | <nowiki>,</nowiki> | <nowiki>_</nowiki> | WPUNCT | WPUNCT | <nowiki>_</nowiki> | 19 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 13 | , | , | PUNC | PUNC_KOMA | _ | | | 13 | a | a | T | Tf | <nowiki>def=yes</nowiki> | 15 | DET | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 14 | baina | baina | LOT | LOT_JNT | AURK | | | 14 | múlt | múlt | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 15 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 15 | atzerrian | atzerri | IZE | IZE_ARR | INE<nowiki>|</nowiki>NUMS<nowiki>|</nowiki>MUGM | | | 15 | héten | hét | N | Nc | <nowiki>n=singular|case=superessive|proper=no</nowiki> | 16 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 16 | izaniko | izan | ADI | ADI_SIN | PART<nowiki>|</nowiki>GEL | | | 16 | megjelent | megjelent | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 19 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 17 | kontaktu | kontaktu | IZE | IZE_ARR | _ | | | 17 | földművelésügyi | földművelésügyi | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 18 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 18 | horiek | horiek | DET | DET_ERKARR | ABS<nowiki>|</nowiki>NUMP<nowiki>|</nowiki>MUGM | | | 18 | minisztériumi | minisztériumi | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 19 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 19 | ezin_direla | ezin_izan | ADI | ADI_ADK | PNT<nowiki>|</nowiki>KONPL<nowiki>|</nowiki>A1<nowiki>|</nowiki>NR_HAIEK<nowiki>|</nowiki>MWCorrect | | | 19 | rendelet | rendelet | N | Nc | <nowiki>n=singular|case=nominative|proper=no</nowiki> | 20 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 20 | delitutzat | delitu | IZE | IZE_ARR | BIZ-<nowiki>|</nowiki>PRO<nowiki>|</nowiki>MG | | | 20 | alapján | alap | N | Nc | <nowiki>n=singular|case=superessive|proper=no|pperson=3rd|pnumber=singular</nowiki> | 21 | SUP | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 21 | hartu | hartu | ADI | ADI_SIN | PART | | | 21 | kérik | kér | V | Vm | <nowiki>mood=indicative|t=present|p=3rd|n=plural|def=yes</nowiki> | 5 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 22 | . | . | PUNC | PUNC_PUNC | _ | | | 22 | ősszel | ősszel | R | Rx | <nowiki>_</nowiki> | 23 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 23 | lejáró | lejáró | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 27 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
The first sentence of the BDT-II training data: | | 24 | <nowiki>,</nowiki> | <nowiki>_</nowiki> | WPUNCT | WPUNCT | <nowiki>_</nowiki> | 27 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 25 | éven | év | N | Nc | <nowiki>n=singular|case=superessive|proper=no</nowiki> | 26 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 1 | Estatu_Batuetako_DEAko | Estatu_Batuak_DEA | IZE | LIB | PLU:+<nowiki>|</nowiki>IZAUR:-<nowiki>|</nowiki>KAS:GEL<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>MW:B<nowiki>|</nowiki>ENT:Erakundea | 2 | ncmod | _ | _ | | | 26 | belüli | belüli | A | Af | <nowiki>deg=positive|n=singular|case=nominative</nowiki> | 27 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 2 | buru | buru | IZE | ARR | _ | 4 | ncsubj | _ | _ | | | 27 | hiteleik | hitel | N | Nc | <nowiki>n=plural|case=nominative|proper=no|pperson=3rd|pnumber=plural</nowiki> | 28 | ATT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 3 | ohiak | ohi | ADJ | ARR | IZAUR:-<nowiki>|</nowiki>KAS:ERG<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ | | | 28 | átütemezését | átütemezés | N | Nc | <nowiki>n=singular|case=accusative|proper=no|pperson=3rd|pnumber=singular</nowiki> | 21 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 4 | aztertuko | aztertu | ADI | SIN | ADM:PART<nowiki>|</nowiki>ASP:GERO | 0 | ROOT | _ | _ | | | 29 | <nowiki>.</nowiki> | <nowiki>_</nowiki> | SPUNCT | SPUNCT | <nowiki>_</nowiki> | 3 | PUNCT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 5 | du | *edun | ADL | ADL | MDN:A1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORK:HARK | 4 | auxmod | _ | _ | | |
| 6 | RUCen | RUC | IZE | IZB | MTKAT:SIG<nowiki>|</nowiki>KAS:GEN<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Erakundea | 7 | ncmod | _ | _ | | |
| 7 | erreforma | erreforma | IZE | ARR | KAS:ABS<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 4 | ncobj | _ | _ | | |
| 8 | . | . | PUNT_MARKA | PUNT_PUNT | _ | 7 | PUNC | _ | _ | | |
| |
The first sentence of the BDT-II development data: | |
| |
| 1 | Irakaskuntzan | irakaskuntza | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ | | |
| 2 | jardun | jardun | ADI | SIN | ADM:PART<nowiki>|</nowiki>ASP:BURU | 0 | ROOT | _ | _ | | |
| 3 | zuen | *edun | ADL | ADL | MDN:B1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORK:HARK | 2 | auxmod | _ | _ | | |
| 4 | Miel | Miel | IZE | IZB | PLU:-<nowiki>|</nowiki>ENT:Pertsona | 5 | entios | _ | _ | | |
| 5 | Anjel_Elustondok | Anjel_Elustondo | IZE | IZB | PLU:-<nowiki>|</nowiki>KAS:ERG<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Pertsona | 2 | ncsubj | _ | _ | | |
| 6 | 1980 | 1980 | IZE | ZKI | _ | 7 | ncmod | _ | _ | | |
| 7 | urtetik | urte | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:ABL<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ | | |
| 8 | 1992ra | 1992 | IZE | ZKI | KAS:ALA<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ | | |
| 9 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 8 | PUNC | _ | _ | | |
| 10 | hauetatik | hauek | DET | ERKARR | KAS:ABL<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ | | |
| 11 | hamar | hamar | DET | DZH | NMG:P | 12 | detmod | _ | _ | | |
| 12 | urtez | urte | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INS<nowiki>|</nowiki>MUG:MG | 16 | lot | _ | _ | | |
| 13 | Azpeitiko | Azpeitia | IZE | LIB | PLU:-<nowiki>|</nowiki>KAS:GEL<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Tokia | 14 | ncmod | _ | _ | | |
| 14 | ikastolan | ikastola | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ | | |
| 15 | irakasle | irakasle | IZE | ARR | KAS:ABS<nowiki>|</nowiki>MUG:MG | 16 | ncpred | _ | _ | | |
| 16 | eta | eta | LOT | JNT | ERL:EMEN | 8 | aponcmod | _ | _ | | |
| 17 | beste | beste | DET | DZG | _ | 18 | detmod | _ | _ | | |
| 18 | biak | bi | IZE | ZKI | KAS:ABS<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M | 16 | lot | _ | _ | | |
| 19 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 18 | PUNC | _ | _ | | |
| 20 | Arabako | Araba | IZE | LIB | PLU:-<nowiki>|</nowiki>KAS:GEL<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Tokia | 21 | ncmod | _ | _ | | |
| 21 | ikastolen | ikastola | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:GEN<nowiki>|</nowiki>NUM:P<nowiki>|</nowiki>MUG:M | 22 | ncmod | _ | _ | | |
| 22 | elkartean | elkarte | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ | | |
| 23 | . | . | PUNT_MARKA | PUNT_PUNT | _ | 22 | PUNC | _ | _ | | |
| |
The first sentence of the BDT-II test data: | |
| |
| 1 | Hegoaldean | hegoalde | IZE | ARR | KAS:INE<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 2 | ncmod | _ | _ | | |
| 2 | iduri_zait | iduri_izan | ADI | ADK | ASP:PNT<nowiki>|</nowiki>MDN:A1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORI:NIRI<nowiki>|</nowiki>MW:B | 0 | ROOT | _ | _ | | |
| 3 | euskararen | euskara | IZE | ARR | BIZ:-<nowiki>|</nowiki>KAS:GEN<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 4 | ncmod | _ | _ | | |
| 4 | mundu | mundu | IZE | ARR | BIZ:- | 7 | ncsubj | _ | _ | | |
| 5 | hau | hau | DET | ERKARR | KAS:ABS<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 4 | detmod | _ | _ | | |
| 6 | adi-adi | adi-adi | ADB | ARR | _ | 7 | ncmod | _ | _ | | |
| 7 | dagola | egon | ADT | ADT | ASP:PNT<nowiki>|</nowiki>ERL:KONPL<nowiki>|</nowiki>MDN:A3<nowiki>|</nowiki>NOR:HURA | 2 | ccomp_obj | _ | _ | | |
| 8 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 7 | PUNC | _ | _ | | |
| 9 | Euskaltzaindiak | Euskaltzaindia | IZE | LIB | PLU:-<nowiki>|</nowiki>KAS:ERG<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M<nowiki>|</nowiki>ENT:Tokia | 11 | ncsubj | _ | _ | | |
| 10 | zer | zer | DET | NOLGAL | NMG:MG<nowiki>|</nowiki>KAS:ABS<nowiki>|</nowiki>MUG:MG | 11 | ncobj | _ | _ | | |
| 11 | erranen | erran | ADI | SIN | ADM:PART<nowiki>|</nowiki>ASP:GERO | 13 | menos | _ | _ | | |
| 12 | duen | *edun | ADL | ADL | ERL:ZHG<nowiki>|</nowiki>MDN:A1<nowiki>|</nowiki>NOR:HURA<nowiki>|</nowiki>NORK:HARK | 11 | auxmod | _ | _ | | |
| 13 | zain | zain | ADB | ARR | _ | 7 | cmod | _ | _ | | |
| 14 | , | , | PUNT_MARKA | PUNT_KOMA | _ | 13 | PUNC | _ | _ | | |
| 15 | haren | hura | DET | ERKARR | KAS:GEN<nowiki>|</nowiki>NUM:S<nowiki>|</nowiki>MUG:M | 16 | ncmod | _ | _ | | |
| 16 | arauen | arau | IZE | ARR | KAS:ABS<nowiki>|</nowiki>MUG:MG | 18 | ncmod | _ | _ | | |
| 17 | berehala | berehala | ADB | ARR | _ | 18 | ncmod | _ | _ | | |
| 18 | betetzeko | bete | ADI | SIN | ADM:ADIZE<nowiki>|</nowiki>ERL:HELB<nowiki>|</nowiki>KAS:ABS<nowiki>|</nowiki>MUG:MG | 7 | xmod | _ | _ | | |
| 19 | . | . | PUNT_MARKA | PUNT_PUNT | _ | 18 | PUNC | _ | _ | | |
| |
==== Parsing ==== | ==== Parsing ==== |