Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:es [2011/11/20 21:44] zeman Spanish domain and size. |
user:zeman:treebanks:es [2011/11/20 21:45] zeman |
||
---|---|---|---|
Line 50: | Line 50: | ||
The CoNLL 2006 version contains 95028 tokens in 3512 sentences, yielding 27.06 tokens per sentence on average (CoNLL 2006 data split: 89334 tokens / 3306 sentences training, 5694 tokens / 206 sentences test). | The CoNLL 2006 version contains 95028 tokens in 3512 sentences, yielding 27.06 tokens per sentence on average (CoNLL 2006 data split: 89334 tokens / 3306 sentences training, 5694 tokens / 206 sentences test). | ||
- | The CoNLL 2009 version contains 528,440 tokens in 17709 sentences, yielding 29.59 tokens per sentence on average (CoNLL 2009 data split: 427,442 tokens / 14329 sentences training, 50368 tokens / 1655 sentences development, | + | The CoNLL 2009 version contains 528,440 tokens in 17709 sentences, yielding 29.84 tokens per sentence on average (CoNLL 2009 data split: 427,442 tokens / 14329 sentences training, 50368 tokens / 1655 sentences development, |
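The averages quoted above can be rechecked from the token and sentence counts. The following is a minimal sketch, not part of the original page; the helper name `tokens_per_sentence` is hypothetical, and only figures that appear in full in the text are used.

```python
def tokens_per_sentence(tokens: int, sentences: int) -> float:
    """Average sentence length, rounded to two decimals as on the page."""
    return round(tokens / sentences, 2)


# Totals as stated in the text.
conll2006_avg = tokens_per_sentence(95028, 3512)
conll2009_avg = tokens_per_sentence(528440, 17709)

# The CoNLL 2006 split (training + test) should add up to the stated totals.
split_2006_consistent = (89334 + 5694 == 95028) and (3306 + 206 == 3512)

print(conll2006_avg, conll2009_avg, split_2006_consistent)
```

Under these counts, the CoNLL 2009 average comes out as the value in the newer revision (29.84), not the older 29.59.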
==== Inside ==== | ==== Inside ==== |