Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:interset:drivers [2008/04/04 09:09]
zeman 1st person in Portuguese.
user:zeman:interset:drivers [2008/04/25 09:01]
zeman Portuguese work time summary.
Line 98: Line 98:
 Work finished: 31.3.2008 Work finished: 31.3.2008
 Total work time: 10 min Total work time: 10 min
- 
- 
- 
- 
- 
  
 ===== Portuguese (pt) ===== ===== Portuguese (pt) =====
Line 110: Line 105:
 http://visl.sdu.dk/visl/pt/info/symbolset-floresta.html http://visl.sdu.dk/visl/pt/info/symbolset-floresta.html
 http://en.wikipedia.org/wiki/Portuguese_grammar http://en.wikipedia.org/wiki/Portuguese_grammar
 +
 +Work started: 2.4.2008
 +Work finished: 24.4.2008
 +Total work time: 28:18 h
 +
 +The CoNLL version of the Floresta tagset was a real pain. The tagset is complex, with many features, some of them strangely overlapping and some undocumented. The annotation also contained a terrible proportion of noise: typos and other introduced errors.
  
 | **Feature** | **Explanation** | **Examples** | | **Feature** | **Explanation** | **Examples** |
Line 249: Line 250:
 | <prop>M | noise; should be two features | | | <prop>M | noise; should be two features | |
 | <prparg> | noise; should be <co-prparg> | | | <prparg> | noise; should be <co-prparg> | |
-| R | noise | 2 occurrences |
 +| R | noise; should be PR | 2 occurrences |
 | recohidas> | noise; should be <ALT> | recolhidas | | recohidas> | noise; should be <ALT> | recolhidas |
 | <rel><ks> | noise; should be two features | | | <rel><ks> | noise; should be two features | |
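The noise rows above amount to a small correction table that could be applied before decoding a tag. The following is only a minimal sketch of that idea, not the driver's actual code: the names `NOISE_FIXES` and `normalize_features` are hypothetical, and the way `<prop>M` and `<rel><ks>` are split into two features is an assumption, since the table only says they "should be two features".

```python
# Hypothetical correction table built from the noise rows listed above.
# Each noisy feature maps to the list of features it presumably stands for.
NOISE_FIXES = {
    "<prop>M": ["<prop>", "M"],      # assumed split into two features
    "<prparg>": ["<co-prparg>"],
    "R": ["PR"],
    "recohidas>": ["<ALT>"],
    "<rel><ks>": ["<rel>", "<ks>"],  # assumed split into two features
}


def normalize_features(features):
    """Replace known noisy features with their corrected form, keeping order."""
    cleaned = []
    for feature in features:
        # Unknown features pass through unchanged.
        cleaned.extend(NOISE_FIXES.get(feature, [feature]))
    return cleaned


if __name__ == "__main__":
    print(normalize_features(["<prparg>", "R", "<rel><ks>"]))
```

Keeping the corrections in one lookup table, separate from the decoding logic, means each newly discovered noise item only needs one more entry.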