Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:hr [2014/07/17 21:23] zeman Sample. |
user:zeman:treebanks:hr [2014/07/17 21:27] zeman Finalizing the page. |
||
---|---|---|---|
Line 38: | Line 38: | ||
The improved pre-release version contains 83640 tokens in 3736 sentences, yielding 22.39 tokens per sentence on average. | The improved pre-release version contains 83640 tokens in 3736 sentences, yielding 22.39 tokens per sentence on average. | ||
+ | |||
+ | There is no official training-test division of the original data. For HamleDT, we have split the data 90:10, i.e. the first 3362 sentences (75236 tokens) for training and the remaining 374 sentences (8404 tokens) for testing. | ||
==== Inside ==== | ==== Inside ==== | ||
Line 70: | Line 72: | ||
(The sum of the percentages exceeds 100% because of rounding.) | (The sum of the percentages exceeds 100% because of rounding.) | ||
- | ==== XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ==== | ||
==== Sample ==== | ==== Sample ==== | ||
Line 105: | Line 106: | ||
==== Parsing ==== | ==== Parsing ==== | ||
- | Nonprojectivities in BTB are rare. Only 747 of the 196, | + | Nonprojectivities in SETimes.HR |
- | + | ||
- | The results of the CoNLL 2006 shared task are [[http:// | + | |
- | + | ||
- | ^ Parser (Authors) ^ LAS ^ UAS ^ | + | |
- | | MST (McDonald et al.) | 87.57 | 92.04 | | + | |
- | | Malt (Nivre et al.) | 87.41 | 91.72 | | + | |
- | | Nara (Yuchang Cheng) | 86.34 | 91.30 | | + | |
+ | //Are there any published parsing results on this corpus?// |
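
The 90:10 split added in this revision (the first 3362 sentences for training, the remaining 374 for testing) can be illustrated with a minimal sketch. It assumes the sentences are already loaded as a Python list in corpus order; the function name `split_train_test` and the stand-in data are hypothetical and are not part of HamleDT or its tooling.

```python
def split_train_test(sentences, train_parts=9, total_parts=10):
    """Split a list of sentences in corpus order (no shuffling).

    The first train_parts/total_parts of the sentences (rounded down)
    go to training and the rest go to testing. Integer arithmetic is
    used so the cut point does not depend on floating-point rounding.
    """
    cut = len(sentences) * train_parts // total_parts
    return sentences[:cut], sentences[cut:]


# Stand-in data: 3736 placeholder items, one per sentence in the corpus.
corpus = list(range(3736))
train, test = split_train_test(corpus)

print(len(train), len(test))    # 3362 374
print(round(83640 / 3736, 2))   # 22.39 tokens per sentence on average
```

Rounding the cut point down gives 3362 training sentences out of 3736, which matches the figures quoted in the revision; the token counts (75236 and 8404) depend on the actual sentences and cannot be derived from the sentence counts alone.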