
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.


Next revision
Previous revision
user:zeman:treebanks:la [2012/01/08 14:33]
zeman created
user:zeman:treebanks:la [2012/01/08 15:24] (current)
zeman Parsing.
Line 40: Line 40:
 ==== Inside ====
  
-The native file format of the treebank is based on XML. Greek letters are romanized using [[http://www.tlg.uci.edu/encoding/quickbeta.pdf|Beta Code]], a romanization scheme widely used beyond the Perseus project. Beta Code can be mapped one-to-one onto the original Greek letters in UTF-8; however, embedded non-Greek words (such as the lemmas “comma” and “other”) cannot be identified automatically (and we do not want to decode them).+The native file format of the treebank is based on XML.
  
 Morphological annotation consists of a lemma and a nine-character positional morphosyntactic tag. Disambiguation has been done manually (gold standard).
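The nine positions of the tag can be read off one character at a time. The following is a minimal sketch, assuming the position order matches the feature labels that appear in the CoNLL rows on this page (pos, per, num, ten, mod, voi, gen, cas, deg); the function names are hypothetical and not part of any treebank tool.

```python
# Feature labels in tag order; taken from the CoNLL rows shown on this page.
FIELDS = ("pos", "per", "num", "ten", "mod", "voi", "gen", "cas", "deg")


def decode_postag(postag):
    """Split a nine-character positional tag into a feature dictionary."""
    if len(postag) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} characters, got {len(postag)}: {postag!r}"
        )
    return dict(zip(FIELDS, postag))


def feats_string(postag):
    """Render the tag as a pipe-separated name=value string."""
    return "|".join(f"{name}={value}" for name, value in decode_postag(postag).items())


# The tag of "esset" from the sample sentence.
print(feats_string("v3sisa---"))
# prints: pos=v|per=3|num=s|ten=i|mod=s|voi=a|gen=-|cas=-|deg=-
```

A hyphen in any position means the feature is not set for that word.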
  
-The syntactic annotation style is very similar to that of the Prague Dependency Treebank. The syntactic tags (analytical functions) are almost identical, too. However, in AGDT some combined values are permitted that are not valid in PDT, e.g. ''ATR_AP_ExD0_APOS''.+The syntactic annotation style is very similar to that of the Prague Dependency Treebank. The syntactic tags (analytical functions) are almost identical, too.
  
 ==== Sample ====
Line 51: Line 51:
  
 <code xml><?xml version="1.0"?> <code xml><?xml version="1.0"?>
-<treebank version="1.2"+<treebank version="1.5"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:treebank="http://nlp.perseus.tufts.edu/syntax/treebank/1.5"  xmlns:treebank="http://nlp.perseus.tufts.edu/syntax/treebank/1.5"
- xsi:schemaLocation="http://nlp.perseus.tufts.edu/syntax/treebank/1.5 treebank-1.5.xsd" + xsi:schemaLocation="http://nlp.perseus.tufts.edu/syntax/treebank/1.5 treebank-1.5.xsd"> 
- xml:lang="grc"> + <sentence id="1" document_id="Perseus:text:1999.02.0002" subdoc="Book=2:chapter=1" span="Cum0:dare0"> 
- <date>Wed Sep 29 12:03:38 EDT 2010</date> + <word id="1" form="Cum" lemma="cum1" postag="c--------" head="20" relation="AuxC" /> 
- <annotator> + <word id="2" form="esset" lemma="sum1" postag="v3sisa---" head="1" relation="ADV" /> 
- <short>FrancescoM</short> + <word id="3" form="Caesar" lemma="Caesar1" postag="n-s---mn-" head="2" relation="SBJ" /> 
- <name>Francesco Mambrini</name> + <word id="4" form="in" lemma="in1" postag="r--------" head="2" relation="AuxP" /> 
- <address>Tufts University, Medford, MA, USA</address> + <word id="5" form="citeriore" lemma="citer1" postag="a-s---fbc" head="6" relation="ATR" /> 
- </annotator>+ <word id="6" form="Gallia" lemma="Gallia1" postag="n-s---fb-" head="4" relation="ADV" /> 
- <sentence id="2185285" document_id="Perseus:text:1999.01.0003" subdoc="card=1" span="qeou\s0:.0">+ <word id="7" form="in" lemma="in1" postag="r--------" head="2" relation="AuxP" /> 
- <annotator>FrancescoM</annotator>+ <word id="8" form="hibernis" lemma="hibernus1" postag="n-p---nb-" head="7" relation="ADV" /> 
- <word id="1" cid="32749174" form="qeou\s" lemma="qeo/s1" postag="n-p---ma-" head="3" relation="OBJ" /> + <word id="9" form="," lemma="comma1" postag="u--------" head="13" relation="AuxX" /> 
- <word id="2" cid="32749175" form="me\n" lemma="me/n1" postag="g--------" head="3" relation="AuxY" /> + <word id="10" form="ita" lemma="ita1" postag="d--------" head="2" relation="AuxY" /> 
- <word id="3" cid="32749176" form="ai)tw=" lemma="ai)te/w1" postag="v1spia---" head="0" relation="PRED" /> + <word id="11" form="uti" lemma="uti1" postag="c--------" head="10" relation="AuxC" /> 
- <word id="4" cid="32749177" form="tw=nd&apos;" lemma="o(/de1" postag="p-p---mg-" head="6" relation="ATR" /> + <word id="12" form="supra" lemma="supra1" postag="d--------" head="13" relation="ADV" /> 
- <word id="5" cid="32749178" form="a)pallagh\n" lemma="a)pallagh/1" postag="n-s---fa-" head="3" relation="OBJ" /> + <word id="13" form="demonstravimus" lemma="demonstro1" postag="v1pria---" head="11" relation="ADV" /> 
- <word id="6" cid="32749179" form="po/nwn" lemma="po/nos1" postag="n-p---mg-" head="5" relation="ATR_AP_ExD0_APOS" /> + <word id="14" form="," lemma="comma1" postag="u--------" head="13" relation="AuxX" /> 
- <word id="7" cid="32749180" form="froura=s" lemma="froura/1" postag="n-s---fg-" head="5" relation="ATR_AP_ExD0_APOS" /> + <word id="15" form="crebri" lemma="creber1" postag="a-p---mn-" head="18" relation="ATR" /> 
- <word id="8" cid="32749181" form="e)tei/as" lemma="e)/teios1" postag="a-s---fg-" head="7" relation="ATR" /> + <word id="16" form="ad" lemma="ad1" postag="r--------" head="19" relation="AuxP" /> 
- <word id="9" cid="32749182" form="mh=kos" lemma="mh=kos1" postag="n-s---na-" head="8" relation="ATR" /> + <word id="17" form="eum" lemma="is1" postag="p-s---ma-" head="16" relation="OBJ" /> 
- <word id="10" cid="32749183" form="," lemma="comma1" postag="u--------" head="21" relation="AuxX" /> + <word id="18" form="rumores" lemma="rumor1" postag="n-p---mn-" head="19" relation="SBJ" /> 
- <word id="11" cid="32749184" form="h(\n" lemma="o(/s1" postag="p-s---fa-" head="12" relation="OBJ" /> + <word id="19" form="adferebantur" lemma="affero1" postag="v3piip---" head="20" relation="PRED_CO" /> 
- <word id="12" cid="32749185" form="koimw/menos" lemma="koima/w1" postag="t-sppemn-" head="21" relation="ADV" /> + <word id="20" form="que" lemma="que1" postag="c--------" head="0" relation="COORD" /> 
- <word id="13" cid="32749186" form="ste/gais" lemma="ste/gh1" postag="n-p---fd-" head="12" relation="ADV" /> + <word id="21" form="litteris" lemma="littera1" postag="n-p---fb-" head="25" relation="ADV" /> 
- <word id="14" cid="32749187" form="*)atreidw=n" lemma="*)atrei/dhs1" postag="n-p---mg-" head="13" relation="ATR" /> + <word id="22" form="item" lemma="item1" postag="d--------" head="21" relation="AuxZ" /> 
- <word id="15" cid="32749188" form="a)/gkaqen" lemma="a)/gkaqen1" postag="d--------" head="16" relation="ADV_AP" /> + <word id="23" form="Labieni" lemma="Labienus1" postag="n-s---mg-" head="21" relation="ATR" /> 
- <word id="16" cid="32749189" form="," lemma="comma1" postag="u--------" head="12" relation="APOS" /> + <word id="24" form="certior" lemma="certus1" postag="a-s---mnc" head="25" relation="PNOM" /> 
- <word id="17" cid="32749190" form="kuno\s" lemma="ku/wn1" postag="n-s---mg-" head="18" relation="ATR" /> + <word id="25" form="fiebat" lemma="fio1" postag="v3s-ia---" head="20" relation="PRED_CO" /> 
- <word id="18" cid="32749191" form="di/khn" lemma="di/kh1" postag="n-s---fa-" head="16" relation="ADV_AP" /> + <word id="26" form="omnes" lemma="omnis1" postag="a-p---ma-" head="27" relation="ATR" /> 
- <word id="19" cid="32749192" form="," lemma="comma1" postag="u--------" head="16" relation="AuxX" /> + <word id="27" form="Belgas" lemma="Belgae1" postag="n-p---ma-" head="40" relation="SBJ" /> 
- <word id="20" cid="32749193" form="a)/strwn" lemma="a)/stron1" postag="n-p---ng-" head="23" relation="ATR" /> + <word id="28" form="," lemma="comma1" postag="u--------" head="34" relation="AuxX" /> 
- <word id="21" cid="32749194" form="ka/toida" lemma="ka/toida1" postag="v1sria---" head="7" relation="ATR" /> + <word id="29" form="quam" lemma="qui1" postag="p-s---fa-" head="31" relation="SBJ" /> 
- <word id="22" cid="32749195" form="nukte/rwn" lemma="nu/kteros1" postag="a-p---ng-" head="20" relation="ATR" /> + <word id="30" form="tertiam" lemma="tertius1" postag="a-s---fa-" head="33" relation="ATR" /> 
- <word id="23" cid="32749196" form="o(mh/gurin" lemma="o(mh/guris1" postag="n-s---fa-" head="25" relation="OBJ_AP_CO" /> + <word id="31" form="esse" lemma="sum1" postag="v--pna---" head="34" relation="OBJ" /> 
- <word id="24" cid="32749197" form="," lemma="comma1" postag="u--------" head="25" relation="AuxX" /> + <word id="32" form="Galliae" lemma="Gallia1" postag="n-s---fg-" head="33" relation="ATR" /> 
- <word id="25" cid="32749198" form="kai\" lemma="kai/1" postag="c--------" head="38" relation="COORD" /> + <word id="33" form="partem" lemma="pars1" postag="n-s---fa-" head="31" relation="PNOM" /> 
- <word id="26" cid="32749199" form="tou\s" lemma="o(1" postag="l-p---ma-" head="33" relation="ATR" /> + <word id="34" form="dixeramus" lemma="dico2" postag="v1plia---" head="27" relation="ATR" /> 
- <word id="27" cid="32749200" form="fe/rontas" lemma="fe/rw1" postag="t-pppama-" head="33" relation="ATR" /> + <word id="35" form="," lemma="comma1" postag="u--------" head="34" relation="AuxX" /> 
- <word id="28" cid="32749201" form="xei=ma" lemma="xei=ma1" postag="n-s---na-" head="29" relation="OBJ_CO" /> + <word id="36" form="contra" lemma="contra1" postag="r--------" head="39" relation="AuxP" /> 
- <word id="29" cid="32749202" form="kai\" lemma="kai/1" postag="c--------" head="27" relation="COORD" /> + <word id="37" form="populum" lemma="populus1" postag="n-s---ma-" head="36" relation="ADV" /> 
- <word id="30" cid="32749203" form="qe/ros" lemma="qe/ros1" postag="n-s---na-" head="29" relation="OBJ_CO" /> + <word id="38" form="Romanum" lemma="Romanus1" postag="a-s---ma-" head="37" relation="ATR" /> 
- <word id="31" cid="32749204" form="brotoi=s" lemma="broto/s1" postag="n-p---md-" head="27" relation="OBJ" /> + <word id="39" form="coniurare" lemma="conjuro1" postag="v--pna---" head="40" relation="OBJ_CO" /> 
- <word id="32" cid="32749205" form="lamprou\s" lemma="lampro/s1" postag="a-p---ma-" head="33" relation="ATR" /> + <word id="40" form="que" lemma="que1" postag="c--------" head="24" relation="COORD" /> 
- <word id="33" cid="32749206" form="duna/stas" lemma="duna/sths1" postag="n-p---ma-" head="34" relation="OBJ_AP_CO" /> + <word id="41" form="obsides" lemma="obses1" postag="n-p---ma-" head="44" relation="OBJ" /> 
- <word id="34" cid="32749207" form="," lemma="comma1" postag="---------" head="25" relation="APOS" /> + <word id="42" form="inter" lemma="inter1" postag="r--------" head="44" relation="AuxP" /> 
- <word id="35" cid="32749208" form="e)mpre/pontas" lemma="e)mpre/pw1" postag="t-pppama-" head="37" relation="ATR" /> + <word id="43" form="se" lemma="sui1" postag="p-p---ma-" head="42" relation="OBJ" /> 
- <word id="36" cid="32749209" form="ai)qe/ri" lemma="ai)qh/r1" postag="n-s---md-" head="35" relation="OBJ" /> + <word id="44" form="dare" lemma="do1" postag="v--pna---" head="40" relation="OBJ_CO" />
- <word id="37" cid="32749210" form="[a)ste/ras" lemma="a)sth/r1" postag="n-p---ma-" head="34" relation="OBJ_AP_CO" /> +
- <word id="38" cid="32749211" form="," lemma="comma1" postag="---------" head="21" relation="APOS" /> +
- <word id="39" cid="32749212" form="o(/tan" lemma="o(/tan1" postag="c--------" head="43" relation="AuxC" /> +
- <word id="40" cid="32749213" form="fqi/nwsin" lemma="fqi/w1" postag="v3ppsa---" head="39" relation="OBJ_AP_CO" /> +
- <word id="41" cid="32749214" form="," lemma="comma1" postag="---------" head="43" relation="AuxX" /> +
- <word id="42" cid="32749215" form="a)ntola/s" lemma="a)natolh/1" postag="n-p---fa-" head="43" relation="OBJ_AP_CO" /> +
- <word id="43" cid="32749216" form="te" lemma="te1" postag="g--------" head="38" relation="COORD" /> +
- <word id="44" cid="32749217" form="tw=n]" lemma="o(" postag="p-p---mg-" head="42" relation="ATR" /> +
- <word id="45" cid="32749218" form="." lemma="other" postag="---------" head="0" relation="AuxK" />+
  </sentence></code>  </sentence></code>
  
-The first sentence of the corpus converted to the CoNLL format, with Greek letters decoded (note that this is not the same sentence as above because the conversion script reorders sentences according to their sentence id):+The first sentence of the corpus converted to the CoNLL format:
  
-| 1 | ἄσημα ἄσημος | pos=a<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=p<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=n<nowiki>|</nowiki>cas=a<nowiki>|</nowiki>deg=- OBJ | +| 1 | Cum | cum1 | c | c | <nowiki>pos=c|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 20 | AuxC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-δ’ δέ1 pos=g<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=-<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- | AuxY | _ | _ | +| 2 | esset | sum1 | v | v | <nowiki>pos=v|per=3|num=s|ten=i|mod=s|voi=a|gen=-|cas=-|deg=-</nowiki> | 1 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-αὐτῶν αὐτός | pos=a<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=p<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=n<nowiki>|</nowiki>cas=g<nowiki>|</nowiki>deg=- | ATR | _ | _ | +| 3 | Caesar | Caesar1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=n|deg=-</nowiki> | 2 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-αὐτίκ’ αὐτίκα1 | d | d | pos=d<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=-<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- ADV | _ | _ | +| 4 | in | in1 | r | r | <nowiki>pos=r|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 2 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-ἀγνοίᾳ ἄγνοια1 | pos=n<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>cas=d<nowiki>|</nowiki>deg=- | ADV | _ | _ | +| 5 | citeriore | citer1 | a | a | <nowiki>pos=a|per=-|num=s|ten=-|mod=-|voi=-|gen=f|cas=b|deg=c</nowiki> | 6 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-λαβὼν λαμβάνω1 pos=t<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=a<nowiki>|</nowiki>mod=p<nowiki>|</nowiki>voi=a<nowiki>|</nowiki>gen=m<nowiki>|</nowiki>cas=n<nowiki>|</nowiki>deg=- | | ADV | _ | _ | +| 6 | Gallia | Gallia1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=f|cas=b|deg=-</nowiki> | 4 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-ἔσθει ἔσθω1 | pos=v<nowiki>|</nowiki>per=3<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=p<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>voi=a<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- | PRED | _ | _ | +| 7 | in | in1 | r | r | <nowiki>pos=r|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 2 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-βορὰν βορά1 | n | n | pos=n<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>cas=a<nowiki>|</nowiki>deg=- OBJ | _ | _ | +| 8 | hibernis | hibernus1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=n|cas=b|deg=-</nowiki> | 7 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-ἄσωτον ἄσωτος | a | a | pos=a<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=f<nowiki>|</nowiki>cas=a<nowiki>|</nowiki>deg=- ATR | +| 9 | <nowiki>,</nowiki> | comma1 | u | u | <nowiki>pos=u|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 13 | AuxX | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-10 comma1 pos=u<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=-<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- 11 | AuxX | _ | _ | +| 10 | ita | ita1 | d | d | <nowiki>pos=d|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 2 | AuxY | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-11 ὡς ὡς | pos=d<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=-<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- | 9 | AuxC | _ | _ +| 11 | uti | uti1 | c | c | <nowiki>pos=c|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 10 | AuxC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-12 ὁρᾷς ὁράω1 | v | v | pos=v<nowiki>|</nowiki>per=2<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=p<nowiki>|</nowiki>mod=i<nowiki>|</nowiki>voi=a<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- 11 ADV | +| 12 | supra | supra1 | d | d | <nowiki>pos=d|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 13 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-13 comma1 pos=u<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=-<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- | 11 AuxX | _ | _ | +| 13 | demonstravimus | demonstro1 | v | v | <nowiki>pos=v|per=1|num=p|ten=r|mod=i|voi=a|gen=-|cas=-|deg=-</nowiki> | 11 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-14 γένει γένος | pos=n<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=s<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=n<nowiki>|</nowiki>cas=d<nowiki>|</nowiki>deg=- ADV | +| 14 | <nowiki>,</nowiki> | comma1 | u | u | <nowiki>pos=u|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 13 | AuxX | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-15 period1 pos=u<nowiki>|</nowiki>per=-<nowiki>|</nowiki>num=-<nowiki>|</nowiki>ten=-<nowiki>|</nowiki>mod=-<nowiki>|</nowiki>voi=-<nowiki>|</nowiki>gen=-<nowiki>|</nowiki>cas=-<nowiki>|</nowiki>deg=- | AuxK | _ | _ |+| 15 | crebri | creber1 | a | a | <nowiki>pos=a|per=-|num=p|ten=-|mod=-|voi=-|gen=m|cas=n|deg=-</nowiki> | 18 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 16 | ad | ad1 | r | r | <nowiki>pos=r|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 19 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 17 | eum | is1 | p | p | <nowiki>pos=p|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 16 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 18 | rumores | rumor1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=m|cas=n|deg=-</nowiki> | 19 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 19 | adferebantur | affero1 | v | v | <nowiki>pos=v|per=3|num=p|ten=i|mod=i|voi=p|gen=-|cas=-|deg=-</nowiki> | 20 | <nowiki>PRED_CO</nowiki> | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 20 | que | que1 | c | c | <nowiki>pos=c|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 0 | COORD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 21 | litteris | littera1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=f|cas=b|deg=-</nowiki> | 25 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 22 | item | item1 | d | d | <nowiki>pos=d|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 21 | AuxZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 23 | Labieni | Labienus1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=g|deg=-</nowiki> | 21 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 24 | certior | certus1 | a | a | <nowiki>pos=a|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=n|deg=c</nowiki> | 25 | PNOM | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 25 | fiebat | fio1 | v | v | <nowiki>pos=v|per=3|num=s|ten=-|mod=i|voi=a|gen=-|cas=-|deg=-</nowiki> | 20 | <nowiki>PRED_CO</nowiki> | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 26 | omnes | omnis1 | a | a | <nowiki>pos=a|per=-|num=p|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 27 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 27 | Belgas | Belgae1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 40 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 28 | <nowiki>,</nowiki> | comma1 | u | u | <nowiki>pos=u|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 34 | AuxX | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 29 | quam | qui1 | p | p | <nowiki>pos=p|per=-|num=s|ten=-|mod=-|voi=-|gen=f|cas=a|deg=-</nowiki> | 31 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 30 | tertiam | tertius1 | a | a | <nowiki>pos=a|per=-|num=s|ten=-|mod=-|voi=-|gen=f|cas=a|deg=-</nowiki> | 33 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 31 | esse | sum1 | v | v | <nowiki>pos=v|per=-|num=-|ten=p|mod=n|voi=a|gen=-|cas=-|deg=-</nowiki> | 34 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 32 | Galliae | Gallia1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=f|cas=g|deg=-</nowiki> | 33 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 33 | partem | pars1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=f|cas=a|deg=-</nowiki> | 31 | PNOM | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 34 | dixeramus | dico2 | v | v | <nowiki>pos=v|per=1|num=p|ten=l|mod=i|voi=a|gen=-|cas=-|deg=-</nowiki> | 27 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 35 | <nowiki>,</nowiki> | comma1 | u | u | <nowiki>pos=u|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 34 | AuxX | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 36 | contra | contra1 | r | r | <nowiki>pos=r|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 39 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 37 | populum | populus1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 36 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 38 | Romanum | Romanus1 | a | a | <nowiki>pos=a|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 37 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 39 | coniurare | conjuro1 | v | v | <nowiki>pos=v|per=-|num=-|ten=p|mod=n|voi=a|gen=-|cas=-|deg=-</nowiki> | 40 | <nowiki>OBJ_CO</nowiki> | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 40 | que | que1 | c | c | <nowiki>pos=c|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 24 | COORD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 41 | obsides | obses1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 44 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 42 | inter | inter1 | r | r | <nowiki>pos=r|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> | 44 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 43 | se | sui1 | p | p | <nowiki>pos=p|per=-|num=p|ten=-|mod=-|voi=-|gen=m|cas=a|deg=-</nowiki> | 42 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 44 | dare | do1 | v | v | <nowiki>pos=v|per=-|num=-|ten=p|mod=n|voi=a|gen=-|cas=-|deg=-</nowiki> | 40 | <nowiki>OBJ_CO</nowiki> | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 + 
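The mapping from the XML sample to the CoNLL rows above can be sketched as follows. This is an illustration only: the actual conversion script is not shown on this page, the column layout is inferred from the rows above, and tab-separated output is an assumption. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Feature labels in tag order; taken from the CoNLL rows shown on this page.
FIELDS = ("pos", "per", "num", "ten", "mod", "voi", "gen", "cas", "deg")


def word_to_conll(word):
    """Turn one <word> element into a ten-column, tab-separated CoNLL line."""
    postag = word.get("postag")
    feats = "|".join(f"{name}={value}" for name, value in zip(FIELDS, postag))
    columns = [
        word.get("id"),
        word.get("form"),
        word.get("lemma"),
        postag[0],            # coarse part of speech: first tag character
        postag[0],            # fine part of speech: same value in the rows above
        feats,
        word.get("head"),
        word.get("relation"),
        "_",                  # the last two columns are unused in the rows above
        "_",
    ]
    return "\t".join(columns)


def sentence_to_conll(sentence):
    """Convert every <word> of a <sentence> element, in document order."""
    return [word_to_conll(word) for word in sentence.iter("word")]


# Two words copied from the sample sentence.
SAMPLE = """<sentence id="1">
  <word id="1" form="Cum" lemma="cum1" postag="c--------" head="20" relation="AuxC" />
  <word id="2" form="esset" lemma="sum1" postag="v3sisa---" head="1" relation="ADV" />
</sentence>"""

rows = sentence_to_conll(ET.fromstring(SAMPLE))
for row in rows:
    print(row)
```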
 +The first sentence of the HamleDT test data in the CoNLL format: 
 + 
 +| 1 | In | in1 | r | r | <nowiki>pos=r|per=-|num=-|ten=-|mod=-|voi=-|gen=-|cas=-|deg=-</nowiki> |  | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 2 | nova | novus1 | a | a | <nowiki>pos=a|per=-|num=p|ten=-|mod=-|voi=-|gen=n|cas=a|deg=-</nowiki> | 8 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 3 | fert | fero1 | v | v | <nowiki>pos=v|per=3|num=s|ten=p|mod=i|voi=a|gen=-|cas=-|deg=-</nowiki> | 0 | PRED | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 4 | animus | animus1 | n | n | <nowiki>pos=n|per=-|num=s|ten=-|mod=-|voi=-|gen=m|cas=n|deg=-</nowiki> | 3 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 5 | mutatas | muto1 | t | t | <nowiki>pos=t|per=-|num=p|ten=r|mod=p|voi=p|gen=f|cas=a|deg=-</nowiki> | 7 | ATR | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 6 | dicere | dico2 | v | v | <nowiki>pos=v|per=-|num=-|ten=p|mod=n|voi=a|gen=-|cas=-|deg=-</nowiki> | 3 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 7 | formas | forma1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=f|cas=a|deg=-</nowiki> | 6 | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 8 | corpora | corpus1 | n | n | <nowiki>pos=n|per=-|num=p|ten=-|mod=-|voi=-|gen=n|cas=a|deg=-</nowiki> |  | OBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
  
 ==== Parsing ====
  
-AGDT is an extremely nonprojective treebank, exceeding the nonprojectivity level found in other treebanks by an order of magnitude. 60469 out of the total 308,882 tokens are attached nonprojectively (19.58%).+LDT is an extremely nonprojective treebank. 4042 out of the total 53143 tokens are attached nonprojectively (7.61%).
  
-I am not aware of any published evaluation of Ancient Greek parsing accuracy.+I am not aware of any published evaluation of Latin parsing accuracy.
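The page does not say how nonprojective attachments were counted. The sketch below uses the common definition as an assumption: a token is attached nonprojectively if some token lying strictly between it and its head is not dominated by that head. The function names and the toy tree are hypothetical.

```python
def is_dominated_by(heads, node, ancestor):
    """True if `ancestor` lies on the path from `node` up to the artificial root 0."""
    seen = set()
    while node != 0 and node not in seen:  # `seen` guards against cyclic input
        seen.add(node)
        node = heads[node]
        if node == ancestor:
            return True
    return ancestor == 0


def nonprojective_tokens(heads):
    """Return the ids of tokens whose arc to their head is nonprojective.

    `heads[i]` is the head id of token i; index 0 is a placeholder for the root.
    """
    result = []
    for dep in range(1, len(heads)):
        head = heads[dep]
        lo, hi = sorted((dep, head))
        if any(not is_dominated_by(heads, k, head) for k in range(lo + 1, hi)):
            result.append(dep)
    return result


def percentage(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Toy tree: token 2 is the root, token 1 hangs on token 3 across the root.
toy_heads = [None, 3, 0, 2, 2]
print(nonprojective_tokens(toy_heads))   # prints: [1]

# The counts quoted above for the two treebanks.
print(percentage(4042, 53143))           # prints: 7.61
print(percentage(60469, 308882))         # prints: 19.58
```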
  
