Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks:ta [2012/03/22 10:37]
zeman
user:zeman:treebanks:ta [2012/03/22 10:49]
zeman Sample.
Line 25: Line 25:
     * //no separate citation//
   * Principal publications
-    * Loganathan Ramasamy, Zdeněk Žabokrtský: Tamil Dependency Parsing: Results using Rule Based and Corpus Based Approaches. In: //Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2011) – Volume Part I//, pages 82-95, Tokyo, Japan, 2011, published by Springer Berlin / Heidelberg, ISBN 978-3-642-19399-6.
+    * Loganathan Ramasamy, Zdeněk Žabokrtský: [[http://www.springerlink.com/content/w18v7621070h51g1/|Tamil Dependency Parsing: Results using Rule Based and Corpus Based Approaches]]. In: //Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2011) – Volume Part I//, pages 82-95, Tokyo, Japan, 2011, published by Springer Berlin / Heidelberg, ISBN 978-3-642-19399-6.
     * Loganathan Ramasamy, Zdeněk Žabokrtský: Prague Dependency Style Treebank for Tamil. In: //Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012)//, İstanbul, Turkey, 2012
   * Documentation
     * [[http://ufal.mff.cuni.cz/~ramasamy/tamiltb/0.1/morph_annotation.html|Morphological annotation]]
     * [[http://ufal.mff.cuni.cz/~ramasamy/tamiltb/0.1/dependency_annotation.html|Syntactic annotation]]
+    * Loganathan Ramasamy, Zdeněk Žabokrtský: [[http://ufal.mff.cuni.cz/~ramasamy/papers/2011-TamilTB-TR.pdf|Tamil Dependency Treebank (TamilTB) – 0.1 Annotation Manual]]. Technical Report TR-2011-42, ÚFAL MFF UK, Praha, Czechia, 2011
 
 ==== Domain ====
Line 49: Line 50:
 ==== Sample ====
 
-The first two sentences of the CoNLL 2006 training data:
+The first sentence of the CoNLL version of training data:
  
-| 1 | غِيابُ_giyAbu غِياب_giyAb | N | case=1<nowiki>|</nowiki>def=R ExD | _ | _ |
+| 1 | cennai | cennai | N | <nowiki>NEN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> |  | AAdjn | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 2 | فُؤاد_fu&Ad فُؤاد_fu&Ad | _ | Atr | _ | _ |
+| 2 | arukE | arukE |  | <nowiki>PP-------</nowiki> | <nowiki>_</nowiki> | 18 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 3 | كَنْعان_kanoEAn كَنْعان_kanoEAn | Atr | _ | _ |
+| 3 | sri | sri |  | <nowiki>NEN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 4 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| ||||||||||
+| 4 | perumpuTUril | perumpuTUr |  | <nowiki>NEL-3SN--</nowiki> | <nowiki>Cas=L|Per=3|Num=S|Gen=N</nowiki> | 18 | AAdjn | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-فُؤاد_fu&Ad فُؤاد_fu&Ad | Atr | _ | _ |
+| 5 | kirIn | kirIn |  | <nowiki>NEN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 6 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-كَنْعان_kanoEAn كَنْعان_kanoEAn Sb | _ | _ |
+| 6 | pIltu | pIltu |  | <nowiki>NEN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 11 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-،_, ،_, | _ | | AuxG | _ | _ |
+| 7 | <nowiki>(</nowiki> | <nowiki>(</nowiki> |  | <nowiki>Z:-------</nowiki> | <nowiki>_</nowiki> |  | AuxG | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-رائِد_rA}id رائِد_rA}id | _ | | Atr | _ | _ |
+| 8 | wavIna | wavInam |  | <nowiki>JJ-------</nowiki> | <nowiki>_</nowiki> |  | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-القِصَّة_AlqiS~ap قِصَّة_qiS~ap N | N gen=F<nowiki>|</nowiki>num=S<nowiki>|</nowiki>def=D Atr | _ | _ |
+| 9 | <nowiki>)</nowiki> | <nowiki>)</nowiki> |  | <nowiki>Z:-------</nowiki> | <nowiki>_</nowiki> |  | AuxG | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-القَصِيرَةِ_AlqaSiyrapi قَصِير_qaSiyr A | gen=F<nowiki>|</nowiki>num=S<nowiki>|</nowiki>case=2<nowiki>|</nowiki>def=D | | Atr | _ | _ |
+| 10 | vimAna | vimAnam |  | <nowiki>NO--3SN--</nowiki> | <nowiki>Per=3|Num=S|Gen=N</nowiki> | 11 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-فِي_fiy فِي_fiy | _ | AuxP | _ | _ |
+| 11 | wilaiyaTTukkukk | wilaiyam | N | <nowiki>NND-3SN--</nowiki> | <nowiki>Cas=D|Per=3|Num=S|Gen=N</nowiki> | 12 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-لُبْنانِ_lubonAni لُبْنان_lubonAn case=2<nowiki>|</nowiki>def=R Atr | _ | _ |
+| 12 | Ana | Aku |  | <nowiki>Tg-------</nowiki> | <nowiki>_</nowiki> | 13 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-رَحَلَ_raHala رَحَل-َ_raHal-V | VP pers=3<nowiki>|</nowiki>gen=M<nowiki>|</nowiki>num=S | Pred | _ | _ |
+| 13 | wilam | wilam |  | <nowiki>NNN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 18 | Sb | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-10 مَساءَ_masA'مَساء_masA' | _ | Adv | _ | _ |
+| 14 | yArukkum | yAr | R | <nowiki>RBD-3SA--</nowiki> | <nowiki>Cas=D|Per=3|Num=S|Gen=A</nowiki> | 15 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-11 أَمْسِ_>amosi أَمْسِ_>amosi 10 Atr | _ | _ |
+| 15 | pATippu | pATippu | N | <nowiki>NNN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 16 | Comp | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-12 عَن_Ean عَن_Ean | P | AuxP | _ | _ |
+| 16 | illATa | il |  | <nowiki>PP-------</nowiki> | <nowiki>_</nowiki> | 17 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-13 81_81 81_81 12 Adv | _ | _ |
+| 17 | vakaiyil | vakai | N | <nowiki>NNL-3SN--</nowiki> | <nowiki>Cas=L|Per=3|Num=S|Gen=N</nowiki> | 18 | AAdjn | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-14 عاماً_EAmAF عام_EAm | N | N | gen=M<nowiki>|</nowiki>num=S<nowiki>|</nowiki>case=4<nowiki>|</nowiki>def=13 Atr | _ | _ |
+| 18 | etukkap | etu |  | <nowiki>Vu-T---AA</nowiki> | <nowiki>Ten=T|Voi=A|Neg=A</nowiki> | 20 | Obj | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-15 ._._. | | _ | 0 | AuxK | _ | _ |
+| 19 | patum | patu |  | <nowiki>VR-F3SNPA</nowiki> | <nowiki>Ten=F|Per=3|Num=S|Gen=N|Voi=P|Neg=A</nowiki> | 18 | AuxV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
+| 20 | enRu | en |  | <nowiki>Tt-T----A</nowiki> | <nowiki>Ten=T|Neg=A</nowiki> | 23 | AuxC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
+| 21 | muTalvar | muTalvar | N | <nowiki>NNN-3SH--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=H</nowiki> | 22 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
+| 22 | karuNAwiTi | karuNAwiTi | N | <nowiki>NEN-3SH--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=H</nowiki> | 23 | Sb | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
+| 23 | uRuTiyaLiTT | uRuTiyaLi | V | <nowiki>Vt-T---AA</nowiki> | <nowiki>Ten=T|Voi=A|Neg=A</nowiki> | 0 | Pred | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
+| 24 | uLLAr | uL | V | <nowiki>VR-T3SHAA</nowiki> | <nowiki>Ten=T|Per=3|Num=S|Gen=H|Voi=A|Neg=A</nowiki> | 23 | AuxV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
+| 25 | <nowiki>.</nowiki> | <nowiki>.</nowiki> |  | <nowiki>Z#-------</nowiki> | <nowiki>_</nowiki> | 0 | AuxK | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
  
-The first sentence of the CoNLL 2006 test data:
+The first sentence of the CoNLL version of test data:
  
-| 1 | اِتِّفاقٌ_Ait~ifAqN اِتِّفاق_Ait~ifAq | N | N | case=1<nowiki>|</nowiki>def=ExD _ | _ |
+| 1 | pikAr | pikAr | N | <nowiki>NEN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 2 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 2 | بَيْنَ_bayona | بَيْنَ_bayona | P | P | _ | 1 | AuxP | _ | _ |
+| 2 | iliruwTu | iliruwTu |  | <nowiki>PP-------</nowiki> | <nowiki>_</nowiki> | 4 | AuxP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 3 | لُبْنانِ_lubonAni | لُبْنان_lubonAn | Z | Z | case=2<nowiki>|</nowiki>def=R | 4 | Atr | _ | _
+| 3 | ErALamAna | ErALamAna |  | <nowiki>JJ-------</nowiki> | <nowiki>_</nowiki> |  | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 4 | وَ_wa | وَ_wa | C | C | _ | 2 | Coord |
+| 4 | iLainjarkaL | iLainjar | N | <nowiki>NNN-3PA--</nowiki> | <nowiki>Cas=N|Per=3|Num=P|Gen=A</nowiki> |  | Sb | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 5 | سُورِيَّةٍ_suwriy~apK | سُورِيا_suwriyA | Z | Z | gen=F<nowiki>|</nowiki>num=S<nowiki>|</nowiki>case=2<nowiki>|</nowiki>def=I | 4 | Atr | _ | _ |
+| 5 | vElai | vElai | N | <nowiki>NNN-3SN--</nowiki> | <nowiki>Cas=N|Per=3|Num=S|Gen=N</nowiki> | 6 | Obj | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 6 | عَلَى_EalaY | عَلَى_EalaY | P | P | _ | 1 | AuxP | _ | _ |
+| 6 | TEti | TEtu |  | <nowiki>Vt-T---AA</nowiki> | <nowiki>Ten=T|Voi=A|Neg=A</nowiki> | 9 | AAdjn | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 7 | رَفْعِ_rafoEi | رَفْع_rafoE | N | N | case=2<nowiki>|</nowiki>def=R | 6 | Atr | _ | _
+| 7 | veLi | veLi |  | <nowiki>JJ-------</nowiki> | <nowiki>_</nowiki> | 8 | Atr | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-مُسْتَوَى_musotawaY مُسْتَوَى_musotawaY N | _ | 7 | Atr | _ | _ |
+| 8 | mAwilangkaLukku | mAwilam |  | <nowiki>NND-3PN--</nowiki> | <nowiki>Cas=D|Per=3|Num=P|Gen=N</nowiki> | 9 | AAdjn | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 9 | التَبادُلِ_AltabAduli | تَبادُل_tabAdul | N | N | case=2<nowiki>|</nowiki>def=D 8 | Atr | _ | _ |
+| 9 | kutipeyarwTu | kutipeyar |  | <nowiki>Vt-T---AA</nowiki> | <nowiki>Ten=T|Voi=A|Neg=A</nowiki> | 0 | Pred | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 10 | التِجارِيِّ_AltijAriy~i | تِجارِيّ_tijAriy~ | A | A | case=2<nowiki>|</nowiki>def=D | Atr | _ | _ |
+| 10 | varukinRanar | varu |  | <nowiki>VR-P3PHAA</nowiki> | <nowiki>Ten=P|Per=3|Num=P|Gen=H|Voi=A|Neg=A</nowiki> |  | AuxV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 11 | إِلَى_<ilaY | إِلَى_<ilaY P | P | _ | 7 | AuxP | _ | _ |
+| 11 | <nowiki>.</nowiki> | <nowiki>.</nowiki> | Z | <nowiki>Z#-------</nowiki> | <nowiki>_</nowiki> |  | AuxK | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
-| 12 | 500_500 | 500_500 | Q | Q | _ | 11 | Atr | _ | _ |
-| 13 | مِلْيُونِ_miloyuwni | مِلْيُون_miloyuwn | N | N | case=2<nowiki>|</nowiki>def=R | 12 | Atr | _ | _
-14 دُولارٍ_duwlArK دُولار_duwlAr | N | N | case=2<nowiki>|</nowiki>def=I | 13 | Atr | _ | _ |
-
-The first sentence of the CoNLL 2007 training data:
-
-| 1 | تَعْدادُ | تَعْداد_1 | N | N- Case=1<nowiki>|</nowiki>Defin=R | Sb | _ | _ |
-سُكّانِ ساكِن_1 | N | N| Case=2<nowiki>|</nowiki>Defin=R | 1 | Atr | _ | _ |
-| 3 | 22 | [DEFAULT] | Q | Q- | _ | 2 | Atr | _ | _ |
-| 4 | دَوْلَةً | دَوْلَة_1 | N | N- | Gender=F<nowiki>|</nowiki>Number=S<nowiki>|</nowiki>Case=4<nowiki>|</nowiki>Defin=I | 3 | Atr | _ | _
-عَرَبِيَّةً عَرَبِيّ_1 A| Gender=F<nowiki>|</nowiki>Number=S<nowiki>|</nowiki>Case=4<nowiki>|</nowiki>Defin=I | 4 | Atr | _ | _
-| 6 | سَ | سَ_FUT | F | F- | _ | 7 | AuxM |
-| 7 | يَرْتَفِعُ | اِرْتَفَع_1 | V | VI | Mood=I<nowiki>|</nowiki>Voice=A<nowiki>|</nowiki>Person=3<nowiki>|</nowiki>Gender=M<nowiki>|</nowiki>Number=S | 0 | Pred | _ | _
-| 8 | إِلَى إِلَى_1 P| _ | 7 | AuxP | _ | _ |
-| 9 | 654 | [DEFAULT] | Q | Q| _ | 8 | Adv | _ | _ |
-| 10 | مِلْيُونَ | مِلْيُون_1 | N | N- | Case=4<nowiki>|</nowiki>Defin=R | 9 | Atr | _ | _ |
-11 نَسَمَةٍ نَسَمَة_1 N| Gender=F<nowiki>|</nowiki>Number=S<nowiki>|</nowiki>Case=2<nowiki>|</nowiki>Defin=I | 10 | Atr | _ | _
-12 فِي فِي_1 P| _ | 7 | AuxP | _ | _ |
-| 13 | مُنْتَصَفِ | مُنْتَصَف_1 | N | N- | Case=2<nowiki>|</nowiki>Defin=12 Adv |
-14 القَرْنِ قَرْن_1 | N | N- | Case=2<nowiki>|</nowiki>Defin=D | 13 | Atr | _ | _ |
-
-The first sentence of the CoNLL 2007 test data:
-
-مُقاوَمَةُ | مُقاوَمَة_1 | N | N- | Gender=F<nowiki>|</nowiki>Number=S<nowiki>|</nowiki>Case=1<nowiki>|</nowiki>Defin=R 0 | ExD | _ | _ |
-| 2 | زَواجِ | زَواج_1 | N | N- | Case=2<nowiki>|</nowiki>Defin=R Atr _ | _ |
-| 3 | الطُلّابِ | طالِب_1 | N | N- | Case=2<nowiki>|</nowiki>Defin=D 2 | Atr | _ | _ |
-| 4 | العُرْفِيِّ | عُرْفِيّ_1 | A | A- | Case=2<nowiki>|</nowiki>Defin=D | 2 | Atr | _ | _ |
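
The sample rows follow the ten-column CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL). The sketch below shows one way such a row could be read. It assumes the distributed files are tab-separated, as in the CoNLL-X shared task; the `|` separators on this page are only wiki table markup. The helper names `parse_feats` and `parse_conll_line` are hypothetical and do not belong to any treebank tool. The example row is token 17 of the training sentence.

```python
from typing import Dict

# Column names of the CoNLL-X format, in file order.
CONLL_COLUMNS = [
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
]


def parse_feats(feats: str) -> Dict[str, str]:
    """Turn 'Cas=L|Per=3' into {'Cas': 'L', 'Per': '3'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_conll_line(line: str) -> Dict[str, object]:
    """Parse one tab-separated token line (assumed layout: 10 CoNLL-X columns)."""
    cells = line.rstrip("\n").split("\t")
    if len(cells) != len(CONLL_COLUMNS):
        raise ValueError(f"expected {len(CONLL_COLUMNS)} columns, got {len(cells)}")
    token: Dict[str, object] = dict(zip(CONLL_COLUMNS, cells))
    token["id"] = int(cells[0])           # 1-based position in the sentence
    token["head"] = int(cells[6])         # 0 marks the artificial root
    token["feats"] = parse_feats(cells[5])
    return token


# Token 17 of the training sentence, rebuilt as a tab-separated line.
row = "\t".join([
    "17", "vakaiyil", "vakai", "N", "NNL-3SN--",
    "Cas=L|Per=3|Num=S|Gen=N", "18", "AAdjn", "_", "_",
])
token = parse_conll_line(row)
print(token["form"], token["head"], token["feats"])
```

Splitting FEATS into a dictionary keeps the attribute names used in the sample (`Cas`, `Per`, `Num`, `Gen`) directly addressable, which is convenient when comparing them with the positional tag in the POSTAG column.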
  
 ==== Parsing ====
