Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:it [2012/01/03 15:38] zeman Sample. |
user:zeman:treebanks:it [2012/01/03 15:48] (current) zeman Parsing results. |
||
---|---|---|---|
Line 42: | Line 42: | ||
==== Inside ==== | ==== Inside ==== | ||
- | The original | + | The original |
- | Morphological annotation includes lemmas. Morphosyntactic tags were probably disambiguated manually. The tagset used in SzTB seems to be same or similar to [[http:// | + | Morphological annotation includes lemmas. Morphosyntactic tags were probably disambiguated manually. In the CoNLL version, tags were decomposed into the CPOS column, the POS column and a list of feature-value pairs in the FEAT column. |
- | Personal names have been collapsed into one token, using underscore as the joining character (e.g. Torgyán_József). | + | Multi-word expressions |
==== Sample ==== | ==== Sample ==== | ||
Line 89: | Line 89: | ||
==== Parsing ==== | ==== Parsing ==== | ||
- | SzTB is a mildly nonprojective treebank. 4032 of the 139, | + | Nonprojectivities in ISST-CoNLL are rare. 354 of the 76295 tokens of the CoNLL 2007 version are attached nonprojectively (0.46%). |
- | The results of the CoNLL 2007 shared task are [[http:// | + | The results of the CoNLL 2007 shared task are [[http:// |
^ Parser (Authors) ^ LAS ^ UAS ^ | ^ Parser (Authors) ^ LAS ^ UAS ^ | ||
- | | Malt (Nilsson et al.) | 80.27 | 83.55 | | + | | Nakagawa | 83.61 | 87.91 | |
- | | Sagae | 79.53 | 83.51 | | + | | Malt (Nilsson et al.) | 84.40 | 87.77 | |
- | | Nakagawa | 76.74 | 82.49 | | + | | Sagae | 83.91 | 87.68 | |
- | | Titov et al. | 77.94 | 82.18 | | + | | Carreras |
The two Malt parser results of 2007 (single malt and blended) are described in [[http:// | The two Malt parser results of 2007 (single malt and blended) are described in [[http:// | ||
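The nonprojectivity figure in the Parsing section (354 of 76295 tokens, 0.46%) can be checked with a short script. The following is a minimal sketch, not the procedure actually used for the wiki page. It assumes the standard tab-separated CoNLL layout with the HEAD index in the seventh column, and it counts a token as nonprojectively attached when some token lying between it and its head is not dominated by that head. The function names and the toy sentence are hypothetical and serve only as illustration.

```python
def read_heads(sentence_block):
    """Return the HEAD values (7th CoNLL column) of one sentence, in token order."""
    heads = []
    for line in sentence_block.strip().splitlines():
        columns = line.split("\t")
        heads.append(int(columns[6]))  # assumption: HEAD is column 7 (index 6)
    return heads


def count_nonprojective(heads):
    """Count tokens attached nonprojectively.

    heads[i] is the head of token i + 1; 0 denotes the artificial root.
    """
    def dominated_by(node, ancestor):
        # Walk up the tree from node until we reach the ancestor or the root.
        steps = 0
        while node != 0 and node != ancestor:
            node = heads[node - 1]
            steps += 1
            if steps > len(heads):
                raise ValueError("cycle in head indices")
        return node == ancestor

    count = 0
    for dependent, head in enumerate(heads, start=1):
        left, right = sorted((dependent, head))
        # The attachment is nonprojective if any token strictly between
        # the dependent and its head is not a descendant of the head.
        if any(not dominated_by(k, head) for k in range(left + 1, right)):
            count += 1
    return count


def percent_nonprojective(nonprojective_tokens, total_tokens):
    """Share of nonprojectively attached tokens, in percent, rounded to 2 places."""
    return round(100.0 * nonprojective_tokens / total_tokens, 2)


# Hypothetical four-token sentence (placeholder forms, not from the treebank).
TOY_SENTENCE = "\n".join([
    "1\ta\t_\t_\t_\t_\t3\t_\t_\t_",
    "2\tb\t_\t_\t_\t_\t0\t_\t_\t_",
    "3\tc\t_\t_\t_\t_\t2\t_\t_\t_",
    "4\td\t_\t_\t_\t_\t3\t_\t_\t_",
])

if __name__ == "__main__":
    print(count_nonprojective(read_heads(TOY_SENTENCE)))
    print(percent_nonprojective(354, 76295))
```

In the toy sentence, token 1 depends on token 3 while token 2 (the root's child) sits between them without being dominated by token 3, so exactly one attachment is nonprojective. Applying `percent_nonprojective` to the counts quoted above reproduces the reported 0.46%.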