user:zeman:treebanks:nl [2012/01/10 10:57] zeman Notes. |
user:zeman:treebanks:nl [2012/01/11 11:32] (current) zeman Typo. |
* Leonoor van der Beek, Gosse Bouma, Jan Daciuk, Tanja Gaustad, Robert Malouf, Gertjan van Noord, Robbert Prins, Begoña Villada: [[http://odur.let.rug.nl/~vannoord/trees/Papers/report_ch5.pdf|Algorithms for Linguistic Processing NWO PIONIER Progress Report]]. Groningen, Netherlands, 2002. | * Leonoor van der Beek, Gosse Bouma, Jan Daciuk, Tanja Gaustad, Robert Malouf, Gertjan van Noord, Robbert Prins, Begoña Villada: [[http://odur.let.rug.nl/~vannoord/trees/Papers/report_ch5.pdf|Algorithms for Linguistic Processing NWO PIONIER Progress Report]]. Groningen, Netherlands, 2002. |
* Documentation | * Documentation |
* The files ''doc/tagset.txt'', ''doc/syn_prot.pdf'' and ''doc/diffs.pdf'' in the CoNLL 2006 distribution. | * The files {{:user:zeman:treebanks:nl-tagset.txt|doc/tagset.txt}}, ''doc/syn_prot.pdf'' and ''doc/diffs.pdf'' in the CoNLL 2006 distribution. |
| |
==== Domain ==== | ==== Domain ==== |
| |
full cdbl (newspaper) part of the Eindhoven corpus | Newspaper. The Alpino Treebank consists of “the full cdbl (newspaper) part of the Eindhoven corpus.” |
| |
Unknown (the underlying PAROLE corpus “consists of quotations of 150-250 words from a wide range of randomly selected linguistically representative Danish texts from 1983-1992.”) | |
| |
==== Size ==== | ==== Size ==== |
| |
The CoNLL 2006 version contains 100,238 tokens in 5512 sentences, yielding 18.19 tokens per sentence on average (CoNLL 2006 data split: 94386 tokens / 5190 sentences training, 5852 tokens / 322 sentences test). | The CoNLL 2006 version contains 200,654 tokens in 13735 sentences, yielding 14.61 tokens per sentence on average (CoNLL 2006 data split: 195,069 tokens / 13349 sentences training, 5585 tokens / 386 sentences test). |
| |
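The size figures for the Dutch data can be re-derived from the training/test split. The following is only a small arithmetic check in Python; the variable names are made up for this illustration, and the numbers are copied from the paragraph above.

```python
# Figures copied from the Size paragraph (CoNLL 2006 Dutch data).
train_tokens, train_sentences = 195_069, 13_349
test_tokens, test_sentences = 5_585, 386

# Totals and the average sentence length, rounded to two decimals.
total_tokens = train_tokens + test_tokens
total_sentences = train_sentences + test_sentences
tokens_per_sentence = round(total_tokens / total_sentences, 2)

print(total_tokens, total_sentences, tokens_per_sentence)
```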
==== Inside ==== | ==== Inside ==== |
| |
CoNLL Alpino: The original POS tags from the Alpino Treebank were replaced by POS | In the CoNLL version, the original POS tags from the Alpino Treebank were replaced by POS tags from the Memory-based part-of-speech tagger using the WOTAN tagset, which is described in the file ''tagset.txt''. The morphological annotation includes lemmas. The syntactic annotation is mostly identical to that of the Corpus Gesproken Nederlands (CGN, Spoken Dutch Corpus) as described in the file ''syn_prot.pdf'' (Dutch only). An attempt to describe a number of differences between the CGN and Alpino annotation practice is given in the file ''diff.pdf'' (which is heavily out of date, but the number of differences has been reduced). Conversion issues: head selection, multi-word units, discourse units. |
tags from the Memory-based part-of-speech tagger using the WOTAN | |
tagset, which is described in the file tagset.txt | |
The syntactic annotation is mostly identical to that of the Corpus | |
Gesproken Nederlands (CGN, Spoken Dutch Corpus) as described in the | |
file syn_prot.pdf (Dutch only). An attempt to describe a number of | |
differences between the CGN and Alpino annotation practice is given in | |
the file diff.pdf (which is heavily out of date, but the number of | |
differences has been reduced heavily recently.) | |
3.6 Conversion | |
| |
Issues: | Multi-word expressions have been concatenated into one token, using underscore as the joining character (e.g. "Economische_en_Monetaire_Unie"). They have the special part-of-speech tag ''MWU''; their sub-POS tags and features may describe the individual parts of the unit. E.g. "aan_het" has CPOS ''MWU'', (sub)POS ''Prep_Art'' and features ''voor_bep|onzijd|neut''. |
- head selection | |
- multi-word units | |
- discourse units | |
| |
| |
The original morphosyntactic tags have been converted to fit into the three columns (CPOS, POS and FEAT) of the CoNLL format. There //should// be a 1-1 mapping between the [[http://www.buch-kromann.dk/matthias/treebank/PAROLE-manual.pdf|DDT positional tags]] and the CoNLL 2006 annotation. Use [[http://quest.ms.mff.cuni.cz/cgi-bin/interset/index.pl?tagset=da::conll|DZ Interset]] to inspect the CoNLL tagset. | |
| |
The morphological analysis in the CoNLL 2006 version does not include lemmas (the original DTAG version does contain them). The morphosyntactic tags have been assigned (probably) manually. | |
| |
Some multi-word expressions have been collapsed into one token, using underscore as the joining character. This includes adverbially used prepositional phrases (e.g. i_lørdags = on Saturdays) but not named entities. | |
| |
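A minimal sketch of how a Dutch multi-word unit might be taken apart again. It assumes that the parts of the word form, of the sub-POS and of the feature string are all joined by underscore, and that the features of one part are joined by ''|''. That assumption fits the "aan_het" example, but the description only says the sub-POS and features //may// describe the individual parts, so the hypothetical helper ''split_mwu'' returns ''None'' whenever the three columns do not align.

```python
def split_mwu(form, pos, feats):
    """Split an MWU token into (part, sub-POS, features) triples.

    Assumption (not guaranteed by the treebank documentation): parts are
    joined by "_" in all three columns, features of one part by "|".
    Returns None if the columns do not have the same number of parts.
    """
    parts = form.split("_")
    pos_parts = pos.split("_")
    feat_parts = [chunk.split("|") for chunk in feats.split("_")]
    if not (len(parts) == len(pos_parts) == len(feat_parts)):
        return None
    return list(zip(parts, pos_parts, feat_parts))


# The example token from the description above.
print(split_mwu("aan_het", "Prep_Art", "voor_bep|onzijd|neut"))
```

Returning ''None'' rather than guessing keeps misaligned units visible, so they can be inspected by hand.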
==== Sample ==== | ==== Sample ==== |
| |
The first sentence of DDT 1.0 in the DTAG format: | The first two sentences of the CoNLL 2006 training data: |
| |
<code xml><tei.2> | |
<teiHeader type=text> | |
<fileDesc> | |
<titleStmt> | |
<title>Tagged sample of: 'Jeltsins skæbnetime'</title> | |
</titleStmt> | |
<extent words=158>158 running words</extent> | |
<publicationStmt> | |
<distributor>PAROLE-DK</distributor> | |
<address><addrline>Christians Brygge 1,1., DK-1219 Copenhagen K.</address> | |
<date>1998-06-02</date> | |
<availability status=restricted><p>by agreement with distributor</availability> | |
</publicationStmt> | |
<sourceDesc> | |
<biblStruct> | |
<analytic> | |
<title>Jeltsins skæbnetime</title> | |
<author gender=m born=1925>Nikulin, Leon</author> | |
</analytic> | |
<monogr> | |
<imprint><pubPlace>Denmark</pubPlace> | |
<publisher>Det Fri Aktuelt</publisher> | |
<date>1992-12-01</date> | |
</imprint> | |
</monogr> | |
</biblStruct> | |
</sourceDesc> | |
</fileDesc> | |
<profileDesc> | |
<creation>1992-12-01</creation> | |
<langUsage><language>Danish</langUsage> | |
<textClass> | |
<catRef target="P.M2"> | |
<catRef target="P.G4.8"> | |
<catRef target="P.T9.3"> | |
</textClass> | |
</profileDesc> | |
</teiHeader> | |
<text id=AJK> | |
<body> | |
<div1 type=main> | |
<p> | |
<s> | |
<W lemma="to" msd="AC---U=--" in="9:subj" out="1:mod|2:mod|3:nobj|5:appr">To</W> | |
<W lemma="kendt" msd="ANP[CN]PU=[DI]U" in="-1:mod" out="">kendte</W> | |
<W lemma="russisk" msd="ANP[CN]PU=[DI]U" in="-2:mod" out="">russiske</W> | |
<W lemma="historiker" msd="NCCPU==I" in="-3:nobj" out="">historikere</W> | |
<W lemma="Andronik" msd="NP--U==-" in="1:namef" out="">Andronik</W> | |
<W lemma="Mirganjan" msd="NP--U==-" in="-5:appr" out="-1:namef|1:coord">Mirganjan</W> | |
<W lemma="og" msd="CC" in="-1:coord" out="2:conj">og</W> | |
<W lemma="Igor" msd="NP--U==-" in="1:namef" out="">Igor</W> | |
<W lemma="Klamkin" msd="NP--U==-" in="-2:conj" out="-1:namef">Klamkin</W> | |
<W lemma="tro" msd="VADR=----A-" in="" out="-9:subj|1:mod|2:pnct|3:dobj|12:pnct">tror</W> | |
<W lemma="ikke" msd="RGU" in="-1:mod" out="">ikke</W> | |
<W lemma="," msd="XP" in="-2:pnct" out="">,</W> | |
<W lemma="at" msd="CS" in="-3:dobj" out="2:vobj">at</W> | |
<W lemma="Rusland" msd="NP--U==-" in="1:subj|2:[subj]" out="">Rusland</W> | |
<W lemma="kunne" msd="VADR=----A-" in="-2:vobj" out="-1:subj|1:vobj|2:mod">kan</W> | |
<W lemma="udvikle" msd="VAF-=----P-" in="-1:vobj" out="-2:[subj]">udvikles</W> | |
<W lemma="uden" msd="SP" in="-2:mod" out="1:nobj">uden</W> | |
<W lemma="en" msd="PI-CSU--U" in="-1:nobj" out="2:nobj">en</W> | |
<W lemma="&quot;" msd="XP" in="1:pnct" out="">&quot;</W> |
<W lemma="jernnæve" msd="NCCSU==I" in="-2:nobj" out="-1:pnct|1:pnct">jernnæve</W> |
<W lemma="&quot;" msd="XP" in="-1:pnct" out="">&quot;</W> |
<W lemma="." msd="XP" in="-12:pnct" out="">.</W> | |
</s></code> | |
| |
The first sentence of the CoNLL 2006 training data: | |
| |
| 1 | Samme | _ | A | AN | degree=pos<nowiki>|</nowiki>gender=common/neuter<nowiki>|</nowiki>number=sing/plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=def/indef<nowiki>|</nowiki>transcat=unmarked | 0 | ROOT | _ | _ | | | 1 | Cathy | Cathy | N | N | <nowiki>eigen|ev|neut</nowiki> | 2 | su | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 2 | cifre | _ | N | NC | gender=neuter<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=indef | 1 | nobj | _ | _ | | | 2 | zag | zie | V | V | <nowiki>trans|ovt|1of2of3|ev</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 3 | , | _ | X | XP | _ | 1 | pnct | _ | _ | | | 3 | hen | hen | Pron | Pron | <nowiki>per|3|mv|datofacc</nowiki> | 2 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 4 | de | _ | P | PD | gender=common/neuter<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>register=unmarked | 7 | subj | _ | _ | | | 4 | wild | wild | Adj | Adj | <nowiki>attr|stell|onverv</nowiki> | 5 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 5 | norske | _ | A | AN | degree=pos<nowiki>|</nowiki>gender=common/neuter<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=def/indef<nowiki>|</nowiki>transcat=unmarked | 4 | mod | _ | _ | | | 5 | zwaaien | zwaai | N | N | <nowiki>soort|mv|neut</nowiki> | 2 | vc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 6 | piger | _ | N | NC | gender=common<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=indef | 4 | nobj | _ | _ | | | 6 | <nowiki>.</nowiki> | <nowiki>.</nowiki> | Punc | Punc | punt | 5 | punct | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 7 | tabte | _ | V | VA | mood=indic<nowiki>|</nowiki>tense=past<nowiki>|</nowiki>voice=active | 1 | rel | _ | _ | | | |||||||||| |
| 8 | med | _ | SP | SP | _ | 7 | pobj | _ | _ | | | 1 | Ze | ze | Pron | Pron | <nowiki>per|3|evofmv|nom</nowiki> | 2 | su | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 9 | i_lørdags | _ | RG | RG | degree=unmarked | 7 | mod | _ | _ | | | 2 | had | heb | V | V | <nowiki>trans|ovt|1of2of3|ev</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 10 | mod | _ | SP | SP | _ | 7 | pobj | _ | _ | | | 3 | met | met | Prep | Prep | voor | 8 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 11 | VMs | _ | N | NP | case=gen | 10 | nobj | _ | _ | | | 4 | haar | haar | Pron | Pron | <nowiki>bez|3|ev|neut|attr</nowiki> | 5 | det | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 12 | værtsnation | _ | N | NC | gender=common<nowiki>|</nowiki>number=sing<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=indef | 11 | possd | _ | _ | | | 5 | moeder | moeder | N | N | <nowiki>soort|ev|neut</nowiki> | 3 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 13 | . | _ | X | XP | _ | 1 | pnct | _ | _ | | | 6 | kunnen | kan | V | V | <nowiki>hulp|ott|1of2of3|mv</nowiki> | 2 | vc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 7 | gaan | ga | V | V | <nowiki>hulp|inf</nowiki> | 6 | vc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 8 | winkelen | winkel | V | V | <nowiki>intrans|inf</nowiki> | 11 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 9 | <nowiki>,</nowiki> | <nowiki>,</nowiki> | Punc | Punc | komma | 8 | punct | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 10 | zwemmen | zwem | V | V | <nowiki>intrans|inf</nowiki> | 11 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 11 | of | of | Conj | Conj | neven | 7 | vc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 12 | terrassen | terras | N | N | <nowiki>soort|mv|neut</nowiki> | 11 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 13 | <nowiki>.</nowiki> | <nowiki>.</nowiki> | Punc | Punc | punt | 12 | punct | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| |
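The sample rows follow the ten-column CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL). In the data files the columns are tab-separated; the wiki table above only renders them. The following Python sketch reads one such line. The names ''Token'' and ''parse_conll_line'' are invented for this illustration and are not part of any CoNLL tool.

```python
from collections import namedtuple

# Column names follow the CoNLL-X format description.
Token = namedtuple(
    "Token",
    "id form lemma cpostag postag feats head deprel phead pdeprel",
)


def parse_conll_line(line):
    """Parse one tab-separated CoNLL-X token line into a Token."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError("expected 10 columns, got %d" % len(cols))
    # "_" marks an empty value; features are separated by "|".
    feats = [] if cols[5] == "_" else cols[5].split("|")
    return Token(int(cols[0]), cols[1], cols[2], cols[3], cols[4],
                 feats, int(cols[6]), cols[7], cols[8], cols[9])


# Second token of the first Dutch training sentence.
token = parse_conll_line("2\tzag\tzie\tV\tV\ttrans|ovt|1of2of3|ev\t0\tROOT\t_\t_")
print(token.form, token.lemma, token.feats, token.head, token.deprel)
```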
The first sentence of the CoNLL 2006 test data: | The first two sentences of the CoNLL 2006 test data: |
| |
| 1 | To | _ | A | AC | case=unmarked | 10 | subj | _ | _ | | | 1 | BASISTAKENPAKKET | <nowiki>basis_taken_pakket</nowiki> | Prep | Prep | voor | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 2 | kendte | _ | A | AN | degree=pos<nowiki>|</nowiki>gender=common/neuter<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=def/indef<nowiki>|</nowiki>transcat=unmarked | 1 | mod | _ | _ | | | 2 | JEUGDGEZONDHEIDSZORG | <nowiki>jeugd_gezondheid_zorg</nowiki> | N | N | <nowiki>eigen|ev|neut</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 3 | russiske | _ | A | AN | degree=pos<nowiki>|</nowiki>gender=common/neuter<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=def/indef<nowiki>|</nowiki>transcat=unmarked | 1 | mod | _ | _ | | | 3 | <nowiki>0-19</nowiki> | <nowiki>0-19</nowiki> | Num | Num | <nowiki>hoofd|bep|attr|onverv</nowiki> | 4 | det | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 4 | historikere | _ | N | NC | gender=common<nowiki>|</nowiki>number=plur<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=indef | 1 | nobj | _ | _ | | | 4 | JAAR | JAAR | N | N | <nowiki>eigen|ev|neut</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 5 | Andronik | _ | N | NP | case=unmarked | 6 | namef | _ | _ | | | |||||||||| |
| 6 | Mirganjan | _ | N | NP | case=unmarked | 1 | appr | _ | _ | | | 1 | Daarvoor | daarvoor | Adv | Adv | <nowiki>pron|aanw</nowiki> | 3 | pc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 7 | og | _ | C | CC | _ | 6 | coord | _ | _ | | | 2 | is | ben | V | V | <nowiki>hulpofkopp|ott|3|ev</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 8 | Igor | _ | N | NP | case=unmarked | 9 | namef | _ | _ | | | 3 | gekozen | kies | V | V | <nowiki>trans|verldw|onverv</nowiki> | 2 | vc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 9 | Klamkin | _ | N | NP | case=unmarked | 7 | conj | _ | _ | | | 4 | omdat | omdat | Conj | Conj | <nowiki>onder|metfin</nowiki> | 3 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 10 | tror | _ | V | VA | mood=indic<nowiki>|</nowiki>tense=present<nowiki>|</nowiki>voice=active | 0 | ROOT | _ | _ | | | 5 | gemeenten | gemeente | N | N | <nowiki>soort|mv|neut</nowiki> | 11 | su | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 11 | ikke | _ | RG | RG | degree=unmarked | 10 | mod | _ | _ | | | 6 | bij | bij | Prep | Prep | voor | 12 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 12 | , | _ | X | XP | _ | 10 | pnct | _ | _ | | | 7 | uitstek | uitstek | N | N | <nowiki>soort|ev|neut</nowiki> | 6 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 13 | at | _ | C | CS | _ | 10 | dobj | _ | _ | | | 8 | het | het | Art | Art | <nowiki>bep|onzijd|neut</nowiki> | 10 | det | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 14 | Rusland | _ | N | NP | case=unmarked | 15 | subj | _ | _ | | | 9 | lokale | lokaal | Adj | Adj | <nowiki>attr|stell|vervneut</nowiki> | 10 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 15 | kan | _ | V | VA | mood=indic<nowiki>|</nowiki>tense=present<nowiki>|</nowiki>voice=active | 13 | vobj | _ | _ | | | 10 | gezondheidsbeleid | <nowiki>gezondheid_beleid</nowiki> | N | N | <nowiki>soort|ev|neut</nowiki> | 12 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 16 | udvikles | _ | V | VA | mood=infin<nowiki>|</nowiki>voice=passive | 15 | vobj | _ | _ | | | 11 | kunnen | kan | V | V | <nowiki>hulp|inf</nowiki> | 4 | body | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 17 | uden | _ | SP | SP | _ | 15 | mod | _ | _ | | | 12 | toespitsen | <nowiki>spits_toe</nowiki> | V | V | <nowiki>refl|inf</nowiki> | 11 | vc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 18 | en | _ | P | PI | gender=common<nowiki>|</nowiki>number=sing<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>register=unmarked | 17 | nobj | _ | _ | | | 13 | op | op | Prep | Prep | voor | 12 | pc | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 19 | " | _ | X | XP | _ | 20 | pnct | _ | _ | | | 14 | de | de | Art | Art | <nowiki>bep|zijdofmv|neut</nowiki> | 16 | det | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 20 | jernnæve | _ | N | NC | gender=common<nowiki>|</nowiki>number=sing<nowiki>|</nowiki>case=unmarked<nowiki>|</nowiki>def=indef | 18 | nobj | _ | _ | | | 15 | specifieke | specifiek | Adj | Adj | <nowiki>attr|stell|vervneut</nowiki> | 16 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 21 | " | _ | X | XP | _ | 20 | pnct | _ | _ | | | 16 | gezondheidssituatie | <nowiki>gezondheid_situatie</nowiki> | N | N | <nowiki>soort|ev|neut</nowiki> | 17 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| 22 | . | _ | X | XP | _ | 10 | pnct | _ | _ | | | 17 | en | en | Conj | Conj | neven | 13 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 18 | zorgbehoeften | <nowiki>zorg_behoefte</nowiki> | N | N | <nowiki>soort|mv|neut</nowiki> | 17 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 19 | van | van | Prep | Prep | voor | 16 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 20 | kinderen | kind | N | N | <nowiki>soort|mv|neut</nowiki> | 21 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 21 | en | en | Conj | Conj | neven | 19 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 22 | jongeren | jongere | Adj | Adj | <nowiki>zelfst|vergr|vervneut</nowiki> | 21 | cnj | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 23 | in | in | Prep | Prep | voor | 20 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 24 | de | de | Art | Art | <nowiki>bep|zijdofmv|neut</nowiki> | 26 | det | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 25 | eigen | eigen | Pron | Pron | <nowiki>aanw|neut|attr|weigen</nowiki> | 26 | mod | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 26 | gemeente | gemeente | N | N | <nowiki>soort|ev|neut</nowiki> | 23 | obj1 | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| | 27 | <nowiki>.</nowiki> | <nowiki>.</nowiki> | Punc | Punc | punt | 26 | punct | <nowiki>_</nowiki> | <nowiki>_</nowiki> | |
| |
==== Parsing ==== | ==== Parsing ==== |
| |
Nonprojectivities in DDT are not frequent. Only 988 of the 100,238 tokens in the CoNLL 2006 version are attached nonprojectively (0.99%). | Nonprojectivities in Alpino are quite frequent. 10858 of the 200,654 tokens in the CoNLL 2006 version are attached nonprojectively (5.41%). |
| |
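One way to count nonprojectively attached tokens is sketched below in Python. It assumes a common definition: a token is attached nonprojectively if some token lying between it and its head is not dominated by that head. The script that produced the figures above may use a slightly different definition, so this is an illustration rather than a reproduction. The function name ''nonprojective_tokens'' is made up for the example.

```python
def nonprojective_tokens(heads):
    """Return the 1-based ids of nonprojectively attached tokens.

    heads[i] is the head of token i+1; 0 is the artificial root.
    """
    def is_dominated(node, ancestor):
        # Walk up the head chain; the artificial root dominates everything.
        seen = set()
        while node != 0 and node not in seen:
            if node == ancestor:
                return True
            seen.add(node)
            node = heads[node - 1]
        return ancestor == 0

    result = []
    for dep, head in enumerate(heads, start=1):
        lo, hi = sorted((dep, head))
        # Every token strictly between dependent and head must be
        # dominated by the head, otherwise the arc is nonprojective.
        if any(not is_dominated(k, head) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


# HEAD column of the second Dutch training sentence ("Ze had met haar moeder ...").
heads = [2, 0, 8, 5, 3, 2, 6, 11, 8, 11, 7, 11, 12]
print(nonprojective_tokens(heads))
```

In this sentence only token 3 ("met") is affected: it depends on "winkelen" (8), but "kunnen" (6) and "gaan" (7) stand in between and are not dominated by "winkelen".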
The results of the CoNLL 2006 shared task are [[http://ilk.uvt.nl/conll/results.html|available online]]. They have been published in [[http://aclweb.org/anthology-new/W/W06/W06-2920.pdf|(Buchholz and Marsi, 2006)]]. The evaluation procedure was non-standard because it excluded punctuation tokens. These are the best results for Danish: | The results of the CoNLL 2006 shared task are [[http://ilk.uvt.nl/conll/results.html|available online]]. They have been published in [[http://aclweb.org/anthology-new/W/W06/W06-2920.pdf|(Buchholz and Marsi, 2006)]]. The evaluation procedure was non-standard because it excluded punctuation tokens. These are the best results for Dutch: |
| |
^ Parser (Authors) ^ LAS ^ UAS ^ | ^ Parser (Authors) ^ LAS ^ UAS ^ |
| MST (McDonald et al.) | 84.79 | 90.58 | | | MST (McDonald et al.) | 79.19 | 83.57 | |
| Malt (Nivre et al.) | 84.77 | 89.80 | | | Riedel et al. | 78.59 | 82.91 | |
| Riedel et al. | 83.63 | 89.66 | | | Basis (John O'Neil) | 77.51 | 81.73 | |
| | Malt (Nivre et al.) | 78.59 | 81.35 | |
| |
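The LAS and UAS figures skip punctuation tokens. A rough Python sketch of such scoring follows. It assumes that a token counts as punctuation when all characters of its form belong to a Unicode punctuation category; this is an assumption of the sketch, not a guaranteed replica of the official evaluation script. The helper names ''is_punct'' and ''attachment_scores'' are invented for the example.

```python
import unicodedata


def is_punct(form):
    """Assumed test: every character is Unicode punctuation (category P*)."""
    return form != "" and all(
        unicodedata.category(ch).startswith("P") for ch in form
    )


def attachment_scores(gold, pred):
    """Return (LAS, UAS) in percent over non-punctuation tokens.

    gold and pred are parallel lists of (form, head, deprel) triples.
    """
    total = uas = las = 0
    for (form, g_head, g_rel), (_, p_head, p_rel) in zip(gold, pred):
        if is_punct(form):
            continue
        total += 1
        if g_head == p_head:
            uas += 1
            if g_rel == p_rel:
                las += 1
    if total == 0:
        raise ValueError("no scorable tokens")
    return 100 * las / total, 100 * uas / total


# Gold: first Dutch training sentence. Pred: an invented parser output
# with one wrong label ("hen") and one wrong head ("wild").
gold = [("Cathy", 2, "su"), ("zag", 0, "ROOT"), ("hen", 2, "obj1"),
        ("wild", 5, "mod"), ("zwaaien", 2, "vc"), (".", 5, "punct")]
pred = [("Cathy", 2, "su"), ("zag", 0, "ROOT"), ("hen", 2, "su"),
        ("wild", 2, "mod"), ("zwaaien", 2, "vc"), (".", 2, "punct")]
las, uas = attachment_scores(gold, pred)
print(las, uas)
```

The wrongly attached final period does not affect either score, because punctuation is skipped.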