Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks:fa [2012/01/29 18:19]
zeman Update. I have seen the data!
user:zeman:treebanks:fa [2012/01/29 20:35]
zeman Sample.
Line 42: Line 42:
  
 ==== Sample ==== ==== Sample ====
 +
 +The first sentence of the corpus in the CoNLL format:
 +
 +| 1 | به | به | PREP | PREP | <nowiki>attachment=ISO|senID=23472</nowiki> | 26 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 2 | گزارش | گزارش | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 1 | POSDEP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 3 | خبرنگار | خبرنگار | N | ANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 2 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 4 | مهر | مهر | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 3 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 5 | در | در | PREP | PREP | <nowiki>attachment=ISO|senID=23472</nowiki> | 3 | NPP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 6 | گرگان | گرگان | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 5 | POSDEP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 7 | <nowiki>،</nowiki> | <nowiki>،</nowiki> | PUNC | PUNC | <nowiki>attachment=ISO|senID=23472</nowiki> | 6 | PUNC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 8 | بر | بر | PREP | PREP | <nowiki>attachment=ISO|senID=23472</nowiki> | 26 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 9 | اساس | اساس | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 8 | POSDEP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 10 | باورهای | باور | N | IANM | <nowiki>attachment=ISO|number=PLUR|senID=23472</nowiki> | 9 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 11 | دینی | دینی | ADJ | AJP | <nowiki>attachment=ISO|senID=23472</nowiki> | 10 | NPOSTMOD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 12 | <nowiki>ترکمن‌ها</nowiki> | ترکمن | N | ANM | <nowiki>attachment=ISO|number=PLUR|senID=23472</nowiki> | 10 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 13 | در | در | PREP | PREP | <nowiki>attachment=ISO|senID=23472</nowiki> | 26 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 14 | این | این | PREM | DEMAJ | <nowiki>attachment=ISO|senID=23472</nowiki> | 15 | NPREMOD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 15 | روز | روز | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 13 | POSDEP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 16 | برای | برای | PREP | PREP | <nowiki>attachment=ISO|senID=23472</nowiki> | 26 | NPP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 17 | پیامبر | پیامبر | N | ANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 16 | VPP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 18 | اکرم | اکرم | ADJ | AJP | <nowiki>attachment=ISO|senID=23472</nowiki> | 17 | NPOSTMOD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 19 | <nowiki>(</nowiki> | <nowiki>(</nowiki> | PUNC | PUNC | <nowiki>attachment=ISO|senID=23472</nowiki> | 20 | PUNC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 20 | ص | ص | ADJ | AJP | <nowiki>attachment=ISO|senID=23472</nowiki> | 17 | APP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 21 | <nowiki>)</nowiki> | <nowiki>)</nowiki> | PUNC | PUNC | <nowiki>attachment=ISO|senID=23472</nowiki> | 20 | PUNC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 22 | ناراحتی | ناراحتی | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 26 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 23 | و | و | CONJ | CONJ | <nowiki>attachment=ISO|senID=23472</nowiki> | 22 | NCONJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 24 | بیماری | بیماری | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 23 | POSDEP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 25 | رخ | رخ | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 26 | NVE | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 26 | داد | <nowiki>داد#ده</nowiki> | V | ACT | <nowiki>person=3|attachment=ISO|number=SING|tma=GS|senID=23472</nowiki> | 0 | ROOT | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 27 | که | که | SUBR | SUBR | <nowiki>attachment=ISO|senID=23472</nowiki> | 26 | AJUCL | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 28 | چند | چند | PREM | AMBAJ | <nowiki>attachment=ISO|senID=23472</nowiki> | 29 | NPREMOD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 29 | روز | روز | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 39 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 30 | بعد | بعد | ADJ | AJP | <nowiki>attachment=ISO|senID=23472</nowiki> | 29 | NPOSTMOD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 31 | با | با | PREP | PREP | <nowiki>attachment=ISO|senID=23472</nowiki> | 39 | ADV | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 32 | رحلت | رحلت | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 31 | POSDEP | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 33 | نبی | نبی | N | ANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 32 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 34 | مکرم | مکرم | ADJ | AJP | <nowiki>attachment=ISO|senID=23472</nowiki> | 33 | NPOSTMOD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 35 | اسلام | اسلام | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 33 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 36 | جهان | جهان | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 39 | SBJ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 37 | عزادار | عزادار | ADJ | AJP | <nowiki>attachment=ISO|senID=23472</nowiki> | 39 | MOS | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 38 | ماتمش | ماتم | N | IANM | <nowiki>attachment=ISO|number=SING|senID=23472</nowiki> | 37 | MOZ | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 39 | شد | <nowiki>کرد#کن</nowiki> | V | PASS | <nowiki>person=3|attachment=ISO|number=SING|tma=GS|senID=23472</nowiki> | 27 | PRD | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
 +| 40 | <nowiki>.</nowiki> | <nowiki>.</nowiki> | PUNC | PUNC | <nowiki>attachment=ISO|senID=23472</nowiki> | 26 | PUNC | <nowiki>_</nowiki> | <nowiki>_</nowiki> |
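
The ten columns of the sample appear to follow the CoNLL-X layout: ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL. Below is a minimal Python sketch of reading such rows. It assumes the underlying file is tab-separated with one token per line (the wiki table only renders it) and that every FEATS item has the form `key=value`. The names `Token`, `parse_feats`, `parse_token` and `find_roots` are hypothetical helpers for illustration, not part of any treebank tooling.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    """One token row; PHEAD and PDEPREL are dropped because they are '_' in the sample."""
    id: int
    form: str
    lemma: str
    cpostag: str
    postag: str
    feats: Dict[str, str]
    head: int
    deprel: str


def parse_feats(field: str) -> Dict[str, str]:
    """Split a FEATS field such as 'attachment=ISO|number=SING' into a dict."""
    if field == "_":
        return {}
    # Assumption: every item contains '='; an item without it raises ValueError.
    return dict(item.split("=", 1) for item in field.split("|"))


def parse_token(line: str) -> Token:
    """Parse one tab-separated line with the ten CoNLL-X columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    return Token(
        id=int(cols[0]),
        form=cols[1],
        lemma=cols[2],
        cpostag=cols[3],
        postag=cols[4],
        feats=parse_feats(cols[5]),
        head=int(cols[6]),
        deprel=cols[7],
    )


def find_roots(tokens: List[Token]) -> List[Token]:
    """Return the tokens attached to the artificial root (HEAD = 0)."""
    return [t for t in tokens if t.head == 0]


# Rows 25 and 26 of the sample above, rewritten as tab-separated lines.
SAMPLE_LINES = [
    "25\tرخ\tرخ\tN\tIANM\tattachment=ISO|number=SING|senID=23472\t26\tNVE\t_\t_",
    "26\tداد\tداد#ده\tV\tACT\tperson=3|attachment=ISO|number=SING|tma=GS|senID=23472\t0\tROOT\t_\t_",
]
SAMPLE_TOKENS = [parse_token(line) for line in SAMPLE_LINES]
```

Calling `find_roots(SAMPLE_TOKENS)` on these two rows returns only token 26, the verb whose HEAD is 0 and whose DEPREL is ROOT. The `senID` feature is kept as an ordinary string entry in `feats`, since the sample stores the sentence number there rather than in a separate column.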
  
 ==== Parsing ==== ==== Parsing ====
