Differences
This shows you the differences between two versions of the page.
pub-company:icon2009 [2009/10/23 14:17] stranak |
pub-company:icon2009 [2010/03/25 10:01] (current) stranak |
||
---|---|---|---|
Line 67: | Line 67: | ||
| **Tides+DP11-train-en** | 1402536 | 52947 | | | **Tides+DP11-train-en** | 1402536 | 52947 | | ||
| **Tides+DP11-train-hi** | 1434543 | 57131 | | | **Tides+DP11-train-hi** | 1434543 | 57131 | | ||
+ | | **tides.train+dictfilt-en** | ||
+ | | **tides.train+dictfilt-hi** | ||
+ | | **tides.train+DP11+dictfilt-en** | ||
+ | | **tides.train+DP11+dictfilt-hi** | ||
| **set1-en** | | **set1-en** | ||
| **set1-hi** | | **set1-hi** | ||
Line 72: | Line 76: | ||
| **set2-hi** | | **set2-hi** | ||
| **set3-en** | | **set3-en** | ||
- | | **set3-hi** | + | | **set3-hi** |
| **Tides-dev-en** | | **Tides-dev-en** | ||
| **Tides-dev-hi** | | **Tides-dev-hi** | ||
| **Tides-test-en** | | **Tides-test-en** | ||
| **Tides-test-hi** | | **Tides-test-hi** | ||
+ | |||
- set1 = danielpipes-11+agrocorp-11+wikiner2008+wikiner2009+acl2005 | - set1 = danielpipes-11+agrocorp-11+wikiner2008+wikiner2009+acl2005 | ||
- set2 = emille-11+danielpipes-11+agrocorp-11+wikiner2008+wikiner2009+acl2005 | - set2 = emille-11+danielpipes-11+agrocorp-11+wikiner2008+wikiner2009+acl2005 | ||
- set3 = emille-om+danielpipes-11+agrocorp-11+wikiner2008+wikiner2009+acl2005 | - set3 = emille-om+danielpipes-11+agrocorp-11+wikiner2008+wikiner2009+acl2005 | ||
+ | - dictfilt = Shabdanjali from the web (with many errors, probably from wx-to-utf8). Filtered to get rid of the errors, then entries with multiple meanings were expanded into separate entries, then filtered to keep only words that occur in the large Hindi monolingual corpus. | ||
+ | |||
+ | ^ | ||
+ | | | **tokens unseen in train** | ||
+ | | | ||
+ | | **Tides-test-en** | | ||
+ | | **Tides-test-hi** | | ||
+ | | **Tides-dev-en** | ||
+ | | **Tides-dev-hi** | ||
+ | |||
- | ^ | ||
- | | | **tokens unseen in train** | ||
- | | | ||
- | | **Tides-test-en** | | ||
- | | **Tides-test-hi** | | ||
- | | **Tides-dev-en** | ||
- | | **Tides-dev-hi** | ||
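
The dictfilt note and the "tokens unseen in train" table describe two simple procedures. The following is a minimal sketch of both, not the scripts actually used for this page. All function names and the data layout are hypothetical: each dictionary entry is assumed to be an English headword with a list of Hindi meanings, and the corpus filter is assumed to test the Hindi side as a single token. The first step (removing the errors in the web copy of Shabdanjali) is not specified on the page and is left out.

```python
def expand_entries(entries):
    """Turn each (headword, [meaning, ...]) entry into one pair per meaning.

    The input layout is an assumption made for this sketch.
    """
    for headword, meanings in entries:
        for meaning in meanings:
            meaning = meaning.strip()
            if meaning:  # skip empty meanings left over from cleaning
                yield headword, meaning


def filter_by_corpus(pairs, corpus_vocab):
    """Keep only pairs whose Hindi side occurs in the monolingual corpus vocabulary."""
    return [(en, hi) for en, hi in pairs if hi in corpus_vocab]


def unseen_tokens(train_tokens, eval_tokens):
    """Count running tokens of the evaluation set that never occur in training."""
    train_vocab = set(train_tokens)
    return sum(1 for tok in eval_tokens if tok not in train_vocab)


if __name__ == "__main__":
    entries = [("water", ["pAnI", "jala"]), ("noise", ["zzz"])]
    pairs = list(expand_entries(entries))
    kept = filter_by_corpus(pairs, corpus_vocab={"pAnI", "jala"})
    print(kept)
    print(unseen_tokens(["a", "b"], ["a", "c", "c"]))
```

Counting unseen tokens over running tokens (rather than distinct types) is a choice made here for illustration; the page does not say which of the two the table reports.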