Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

Old revision: user:zeman:treebanks:te [2012/03/22 11:30] zeman: Size.
New revision: user:zeman:treebanks:te [2012/03/22 11:34] zeman: Training data size (both sentences and words) was identical in ICON 2009 and 2010.
Line 43:
 ==== Size ====
 
-HyDT-Telugu shows dependencies between chunks, not words. The node/tree ratio is thus much lower than in other treebanks. The ICON 2009 version came with a data split into three parts: training, development and test
+HyDT-Telugu shows dependencies between chunks, not words. The node/tree ratio is thus much lower than in other treebanks. The ICON 2009 version came with a data split into three parts: training, development and test; the same data was also distributed for ICON 2010:
-
-^ Part ^ Sentences ^ Chunks ^ Ratio ^
-| Training | 980 | 6449 | 6.58 |
-| Development | 150 | 811 | 5.41 |
-| Test | 150 | 961 | 6.41 |
-| TOTAL | 1280 | 8221 | 6.42 |
-
-The ICON 2010 version came with a data split into three parts: training, development and test:
 
 ^ Part ^ Sentences ^ Chunks ^ Ratio ^ Words ^ Ratio ^
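
The Ratio column of the ICON 2009 table removed in this revision can be re-derived from the sentence and chunk counts. The following is a minimal Python sketch, assuming the ratio is simply chunks divided by sentences, rounded to two decimals. The names `ICON_2009`, `LISTED_TOTAL`, `chunks_per_sentence` and `table_is_consistent` are illustrative and do not come from the wiki page.

```python
# Figures copied from the ICON 2009 table in the diff:
# part -> (sentences, chunks, ratio as listed on the page).
ICON_2009 = {
    "Training": (980, 6449, 6.58),
    "Development": (150, 811, 5.41),
    "Test": (150, 961, 6.41),
}
LISTED_TOTAL = (1280, 8221, 6.42)


def chunks_per_sentence(sentences, chunks):
    """Node/tree ratio, assumed here to be chunks / sentences (2 decimals)."""
    return round(chunks / sentences, 2)


def table_is_consistent(parts, total, tolerance=0.005):
    """Check that each listed ratio and the TOTAL row follow from the counts."""
    sum_sentences = sum(s for s, _, _ in parts.values())
    sum_chunks = sum(c for _, c, _ in parts.values())
    rows = list(parts.values()) + [total]
    ratios_ok = all(
        abs(chunks_per_sentence(s, c) - listed) < tolerance
        for s, c, listed in rows
    )
    return ratios_ok and (sum_sentences, sum_chunks) == total[:2]


if __name__ == "__main__":
    for part, (s, c, _) in ICON_2009.items():
        print(part, chunks_per_sentence(s, c))
    print("consistent:", table_is_consistent(ICON_2009, LISTED_TOTAL))
```

Under that assumption, the listed ratios and the TOTAL row agree with the per-part counts.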
