Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:interset:drivers [2008/03/31 22:14]
zeman de::conll
user:zeman:interset:drivers [2008/04/03 11:49]
zeman Restructuralization.
Line 38: Line 38:
  
 More than half of the time was spent testing and tuning tags containing the Sem feature.
 +
 +===== Danish (da) =====
 +
 +Tags of the Danish Dependency Treebank converted to CoNLL format. 144 tags with complex documentation in Danish.
 +
 +Total work time: about 7 hours
 +
 +===== English (en) =====
 +
 +Penn Treebank (45 atomic tags). Detailed classification of punctuation.
 +
 +Total work time: about 3 hours
  
 ===== German (de) =====
Line 58: Line 70:
 Work finished: 31.3.2008
 Total work time: 10 min
 +
 +===== Swedish (sv) =====
 +
 +Mamba tagset of Talbanken05. 48 tags, no morphosyntactic categories but detailed classification of auxiliary and modal verbs and punctuation.
 +
 +Total work time: about 3 hours
  
 ===== Time needed for tag set conversion =====
Line 68: Line 86:
 Arabic tags (both Ota's and Buckwalter's, still without Interset, 22.3.2006):
 4:45+1+1:40 = 7:25
- 
-Danish DDT/Parole tags (144 tags with an elaborate description)
-about 7 hours
-
-Swedish Mamba tags (48 tags)
-about 3 hours
-
-Penn Treebank (36 tags)
-about 3 hours, but I was not yet measuring the time here, so this is only a rough retrospective estimate
  
 Hajič's Swedish tags
