Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:hu [2011/12/13 13:20] zeman Sample. |
user:zeman:treebanks:hu [2011/12/13 22:52] (current) zeman Personal names. |
||
---|---|---|---|
Line 54: | Line 54: | ||
==== Inside ==== | ==== Inside ==== | ||
- | Both versions (CoNLL 2007 and BDT-II) are in the CoNLL 2006/2007 format. | + | The original Szeged Treebank is a phrase-based treebank and it is distributed in an XML-based, TEI-compliant format. The CoNLL 2007 version is dependency-based (i.e. the head of each phrase was identified), distributed |
- | The syntactic guidelines (structure and labels) are described in Spanish | + | Morphological annotation includes lemmas. Morphosyntactic tags were probably disambiguated manually. |
- | Multi-word expressions | + | Personal names have been collapsed into one token, using underscore as the joining character (e.g. Torgyán_József). |
==== Sample ==== | ==== Sample ==== | ||
Line 136: | Line 136: | ||
==== Parsing ==== | ==== Parsing ==== | ||
- | BDT is a mildly nonprojective treebank. | + | SzTB is a mildly nonprojective treebank. |
- | The results of the CoNLL 2007 shared task are [[http:// | + | The results of the CoNLL 2007 shared task are [[http:// |
^ Parser (Authors) ^ LAS ^ UAS ^ | ^ Parser (Authors) ^ LAS ^ UAS ^ | ||
- | | Malt (Nilsson et al.) | 76.94 | 82.84 | | + | | Malt (Nilsson et al.) | 80.27 | 83.55 | |
- | | Titov et al. | 75.49 | 81.93 | | + | | Sagae | 79.53 | 83.51 | |
- | | Sagae | 74.64 | 81.19 | | + | | Nakagawa | 76.74 | 82.49 | |
- | | Carreras | 75.75 | 81.11 | | + | | Titov et al. | 77.94 | 82.18 | |
- | | Nakagawa | 72.56 | 81.04 | | + | |
- | | Malt (J. Hall et al.) | 74.99 | 80.61 | | + | |
- | | Johansson | + | |
The two Malt parser results of 2007 (single malt and blended) are described in [[http:// | The two Malt parser results of 2007 (single malt and blended) are described in [[http:// | ||
- | Parsing results on BDT-II have been published in Kepa Bengoetxea, Koldo Gojenola: [[http:// |
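
The personal-name convention noted under Inside (names collapsed into one token with an underscore, e.g. Torgyán_József) can be illustrated with a minimal sketch. The helper names `split_joined_name` and `join_name` are hypothetical and not part of any treebank tool. The sketch assumes that a bare `_` is the empty-field placeholder of the CoNLL format and should be left untouched.

```python
from typing import List


def split_joined_name(form: str) -> List[str]:
    """Split a collapsed personal-name token into its parts (hypothetical helper)."""
    # Assumption: a bare "_" is the CoNLL placeholder for an empty field,
    # not a joined name, so it is returned unchanged.
    if form == "_":
        return [form]
    parts = [part for part in form.split("_") if part]
    # If nothing is left (e.g. the token was only underscores), keep the original.
    return parts or [form]


def join_name(parts: List[str]) -> str:
    """Collapse name parts into one token, joined by underscores."""
    return "_".join(parts)


if __name__ == "__main__":
    print(split_joined_name("Torgyán_József"))
    print(join_name(["Torgyán", "József"]))
```

Splitting is only safe on tokens already known to be personal names; the sketch does not try to detect them.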