Differences
This shows you the differences between two versions of the page.
user:zeman:treebanks:pt [2012/01/11 11:22] zeman Sample. |
user:zeman:treebanks:pt [2012/01/11 11:34] (current) zeman Parsing results. |
||
---|---|---|---|
Line 46: | Line 46: | ||
==== Inside ==== | ==== Inside ==== | ||
- | Texts from Portugal and Brasil. | + | The corpus contains texts from Portugal and Brazil. The texts were automatically parsed using the PALAVRAS parser (Bick 2000: Eckhard Bick. The Parsing System " |
- | The texts were automatically parsed using the PALAVRAS parser (Bick 2000: Eckhard Bick. The Parsing System " | + | Morphological annotation includes lemmas. In the CoNLL version, the original Floresta tags were converted to fit the '' |
- | In the CoNLL version, the original POS tags from the Alpino Treebank were replaced by POS tags from the Memory-based part-of-speech tagger using the WOTAN tagset, which is described in the file '' | + | Multi-word expressions have been concatenated into one token, using underscore as the joining character (e.g. "7_e_Meio", "Hillary_Clinton"). |
- | + | ||
- | Multi-word expressions have been concatenated into one token, using underscore as the joining character (e.g. "Economische_en_Monetaire_Unie"). They have special part-of-speech tags '' | + | |
==== Sample ==== | ==== Sample ==== | ||
Line 117: | Line 115: | ||
==== Parsing ==== | ==== Parsing ==== | ||
- | Nonprojectivities in Alpino are quite frequent. 10858 of the 200,654 tokens in the CoNLL 2006 version are attached nonprojectively (5.41%). | + | Bosque is a mildly nonprojective treebank. 2778 of the 212,545 tokens in the CoNLL 2006 version are attached nonprojectively (1.31%). |
- | The results of the CoNLL 2006 shared task are [[http:// | + | The results of the CoNLL 2006 shared task are [[http:// |
^ Parser (Authors) ^ LAS ^ UAS ^ | ^ Parser (Authors) ^ LAS ^ UAS ^ | ||
- | | MST (McDonald et al.) | 79.19 | 83.57 | | + | | MST (McDonald et al.) | 86.82 | 91.36 | |
- | | Riedel | + | | Malt (Nivre |
- | | Basis (John O' | + | | Nara (Yuchang Cheng) | 85.07 | 90.30 | |
- | | Malt (Nivre et al.) | 78.59 | 81.35 | | + | |
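
The nonprojectivity figure quoted in the Parsing section (2778 of 212,545 tokens, 1.31%) can be checked with a short script. The following is a minimal sketch, not the procedure actually used for this page. It assumes the standard CoNLL-X layout (ten tab-separated columns, HEAD in column 7, blank line between sentences, tokens listed in ID order) and one common definition of nonprojectivity: a token is attached nonprojectively if some token lying between it and its head is not a descendant of that head. The function names are hypothetical, and a different definition may give slightly different counts.

```python
from typing import Iterable, Iterator, List, Tuple


def read_heads(lines: Iterable[str]) -> Iterator[List[int]]:
    """Yield one list of HEAD values per sentence from CoNLL-X lines."""
    heads: List[int] = []
    for line in lines:
        if not line.strip():
            # A blank line closes the current sentence.
            if heads:
                yield heads
                heads = []
            continue
        columns = line.rstrip("\n").split("\t")
        heads.append(int(columns[6]))  # column 7 (index 6) is HEAD
    if heads:
        yield heads


def descends_from(heads: List[int], node: int, ancestor: int) -> bool:
    """True if `ancestor` lies on the path from `node` up to the root (0)."""
    if ancestor == 0:
        return True  # every token descends from the artificial root
    steps = 0
    while node != 0 and steps < len(heads):  # bound guards against cycles
        node = heads[node - 1]
        if node == ancestor:
            return True
        steps += 1
    return False


def nonprojective_tokens(heads: List[int]) -> List[int]:
    """Return 1-based ids of tokens attached nonprojectively (assumed definition)."""
    result = []
    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        lo, hi = sorted((head, dep))
        # The arc is nonprojective if any token strictly between head and
        # dependent is not dominated by the head.
        if any(not descends_from(heads, k, head) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


def count_nonprojective(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (nonprojectively attached tokens, total tokens)."""
    nonprojective = total = 0
    for heads in read_heads(lines):
        nonprojective += len(nonprojective_tokens(heads))
        total += len(heads)
    return nonprojective, total
```

To use it on a treebank file, pass an open file handle to `count_nonprojective` and compute `100 * nonprojective / total` for the percentage.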