Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:fi [2011/12/05 14:27] zeman Domain. |
user:zeman:treebanks:fi [2011/12/05 14:37] zeman Size. |
||
---|---|---|---|
Line 36: | Line 36: | ||
==== Size ==== | ==== Size ==== | ||
- | All four parts of the treebank together contain 9491 tokens in 1315 sentences, yielding | + | TDT contains 58576 tokens in 4307 sentences, yielding |
- | + | ||
- | ^ File ^ Sentences ^ Terminals ^ Average t/s ^ | + | |
- | | arborest.xml | 175 | 2451 | 14.01 | | + | |
- | | piialaused.xml | 732 | 4505 | 6.15 | | + | |
- | | ratsepalaused.xml | 388 | 2348 | 6.05 | | + | |
- | | sul.xml | 20 | 187 | 9.35 | | + | |
- | | **total** | **1315** | **9491** | **7.22** | | + | |
- | | training | 1184 | 8535 | 7.21 | | + | |
- | | test | 131 | 956 | 7.30 | | + | |
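The "Average t/s" column in the removed table is the terminal count divided by the sentence count, rounded to two decimals. A minimal sketch that recomputes those figures from the counts shown in this diff follows; the names `OLD_FILES`, `tokens_per_sentence` and `totals` are hypothetical helpers introduced only for illustration and are not part of the wiki page or of any treebank tooling.

```python
# Counts copied from the removed table: file -> (sentences, terminals).
# The dictionary and function names are illustrative assumptions.
OLD_FILES = {
    "arborest.xml": (175, 2451),
    "piialaused.xml": (732, 4505),
    "ratsepalaused.xml": (388, 2348),
    "sul.xml": (20, 187),
}


def tokens_per_sentence(sentences, tokens):
    """Return the average number of tokens per sentence, rounded to 2 decimals."""
    return round(tokens / sentences, 2)


def totals(files):
    """Sum the sentence and terminal counts over all files."""
    sentences = sum(s for s, _ in files.values())
    tokens = sum(t for _, t in files.values())
    return sentences, tokens


if __name__ == "__main__":
    for name, (s, t) in OLD_FILES.items():
        print(f"{name}: {tokens_per_sentence(s, t):.2f} t/s")
    s, t = totals(OLD_FILES)
    print(f"total: {s} sentences, {t} terminals, {tokens_per_sentence(s, t):.2f} t/s")
    # The same ratio for the counts quoted in the new revision (58576 tokens, 4307 sentences).
    print(f"TDT: {tokens_per_sentence(4307, 58576):.2f} t/s")
```

The same function can be applied to the training and test rows to check that each row's average matches its counts.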
==== Inside ==== | ==== Inside ==== |