Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:nl [2012/01/10 11:58] zeman Sample. |
user:zeman:treebanks:nl [2012/01/10 12:05] zeman Inside. |
||
---|---|---|---|
Line 43: | Line 43: | ||
==== Inside ==== | ==== Inside ==== | ||
- | CoNLL Alpino: | + | In the CoNLL version, the original |
- | | + | |
- | | + | |
- | The syntactic annotation is mostly identical to that of the Corpus | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | 3.6 Conversion | + | |
- | + | ||
- | Issues: | + | |
- | - head selection | + | |
- | - multi-word units | + | |
- | - discourse units | + | |
- | + | ||
- | + | ||
- | The original morphosyntactic tags have been converted to fit into the three columns (CPOS, POS and FEAT) of the CoNLL format. There //should// be a 1-1 mapping between the [[http:// | + | |
- | + | ||
- | The morphological analysis in the CoNLL 2006 version does not include lemmas (the original DTAG version does contain them). The morphosyntactic tags were probably assigned manually. | + |
- | + | ||
- | Some multi-word expressions have been collapsed into one token, using an underscore as the joining character. This includes adverbially used prepositional phrases (e.g. i_lørdags = last Saturday) but not named entities.
==== Sample ==== | ==== Sample ==== |