Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:grc [2011/12/06 15:00] zeman Inside, sample and parsing. |
user:zeman:treebanks:grc [2011/12/06 16:04] (current) zeman |
||
---|---|---|---|
Line 36: | Line 36: | ||
==== Size ==== | ==== Size ==== | ||
- | AGDT contains | + | AGDT contains |
==== Inside ==== | ==== Inside ==== | ||
Line 42: | Line 42: | ||
The native file format of the treebank is based on XML. Greek letters are romanized using [[http:// | The native file format of the treebank is based on XML. Greek letters are romanized using [[http:// | ||
- | Morphological annotation consists of lemma and nine-character positional morphosyntactic | + | Morphological annotation consists of lemma and nine-character positional morphosyntactic |
The syntactic annotation style is very similar to that of the Prague Dependency Treebank. The syntactic tags (analytical functions) are almost identical, too. However, in AGDT some combined values are permitted that are not valid in PDT, e.g. '' | The syntactic annotation style is very similar to that of the Prague Dependency Treebank. The syntactic tags (analytical functions) are almost identical, too. However, in AGDT some combined values are permitted that are not valid in PDT, e.g. '' | ||
Line 111: | Line 111: | ||
</ | </ | ||
- | The same sentence converted to the CoNLL format, with Greek letters decoded: | + | The first sentence |
| 1 | ἄσημα | ἄσημος | a | a | pos=a< | | 1 | ἄσημα | ἄσημος | a | a | pos=a< | ||
Line 131: | Line 131: | ||
==== Parsing ==== | ==== Parsing ==== | ||
- | AGDT is an extremely nonprojective treebank, exceeding the nonprojectivity level found in other treebanks by an order of magnitude. 60,469 out of the total 309,092 tokens are attached nonprojectively (19.56%). | + | AGDT is an extremely nonprojective treebank, exceeding the nonprojectivity level found in other treebanks by an order of magnitude. 60,469 out of the total 308,882 tokens are attached nonprojectively (19.58%). |
I am not aware of any published evaluation of Ancient Greek parsing accuracy. | I am not aware of any published evaluation of Ancient Greek parsing accuracy. | ||
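The nonprojectivity figures in the Parsing section can be checked with a short script. What follows is a minimal sketch, not the script used to produce the numbers on this page. It assumes the converted data is in CoNLL-X format (tab-separated columns, HEAD in the seventh column, a blank line between sentences), and it uses one common definition: a token is attached nonprojectively if some token lying between it and its head is not dominated by that head. The function names are hypothetical, and a different definition of nonprojectivity may give different counts.

```python
def read_conll_heads(lines):
    """Yield one list of head indices per sentence from CoNLL-X lines.

    Assumption: HEAD is the 7th tab-separated column and 0 marks the root.
    """
    heads = []
    for line in lines:
        if not line.strip():
            # A blank line closes the current sentence.
            if heads:
                yield heads
                heads = []
            continue
        columns = line.rstrip("\n").split("\t")
        heads.append(int(columns[6]))
    if heads:
        yield heads


def dominates(ancestor, node, heads):
    """True if `ancestor` equals `node` or lies on the head chain above it."""
    steps = 0
    while True:
        if node == ancestor:
            return True
        # Stop at the artificial root, or bail out if the data has a cycle.
        if node == 0 or steps > len(heads):
            return False
        node = heads[node - 1]
        steps += 1


def nonprojective_tokens(heads):
    """Return the 1-based indices of tokens attached nonprojectively."""
    result = []
    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        lo, hi = sorted((dep, head))
        # Every token strictly between dependent and head must be
        # dominated by the head for the attachment to be projective.
        if any(not dominates(head, k, heads) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


def percentage(part, total):
    """Share of `part` in `total`, in percent, rounded to two decimals."""
    return round(100.0 * part / total, 2)
```

For a whole treebank, summing `len(nonprojective_tokens(heads))` and `len(heads)` over all sentences from `read_conll_heads` gives the two counts, and `percentage(60469, 308882)` reproduces the 19.58% quoted in the current revision.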