Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks:tr [2013/06/17 16:10]
zeman
user:zeman:treebanks:tr [2013/06/18 14:51]
zeman Uploaded ttbankkl.pdf.
Line 31:
     * Nart B. Atalay, Kemal Oflazer, Bilge Say: [[http://aclweb.org/anthology-new/W/W03/W03-2405.pdf|The Annotation Process in the Turkish Treebank]]. In: Proceedings of the EACL Workshop on Linguistically Interpreted Corpora – LINC. Budapest, Hungary, 2003.
   * Documentation
-    * Three PDF files are attached to the CoNLL version in the ''doc'' folder: ttbankkl.pdf (the chapter from Anne Abeillé, contains list of morphological tags), turkishtreebank.pdf (the paper from the EACL workshop) and user_guide.pdf (annotation manual for dependencies, in Turkish).
+    * Three PDF files are attached to the CoNLL version in the ''doc'' folder: {{:user:zeman:treebanks:ttbankkl.pdf|ttbankkl.pdf}} (the chapter from Anne Abeillé, contains list of morphological tags), turkishtreebank.pdf (the paper from the EACL workshop) and user_guide.pdf (annotation manual for dependencies, in Turkish).
  
 ==== Domain ====
Line 48:
  
 There are special derivational nodes. Derived words have been split into several tokens (see also the sample below). The typical pattern (maybe the only pattern, but I have not confirmed that) is as follows: There are two nodes connected with a dependency link. The head node corresponds to the surface word. It has the word form, part of speech and morphological features but it has no lemma (lemma is '_'). The surface word is a result of a derivational morphological process. It has been derived from another word, often of a different part of speech (e.g. a noun was derived from a verb). The dependent node represents the source of the derivation. It has no word form but it has a lemma. Its part-of-speech tag describes the source word and thus it can differ from the part-of-speech tag of the head node. The FEAT column says just 'Pos'. The dependent node need not be a leaf. Other nodes may depend on it, instead of depending on the parent node. If we have a noun derived from a verb, i.e. we have a verbal node depending on the nominal node, and there is a dependent filling a verbal valency slot of the derived noun, we can expect the dependent to be attached to the verbal node.
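The two-node pattern described above can be sketched in Python. This is only an illustration under stated assumptions: the input is assumed to be in the ten-column, tab-separated CoNLL-X layout, and the sample rows (the words //oku// / //okuma//, the ids, the ''DERIV'' and ''ROOT'' labels and the feature strings) are hypothetical, not copied from the treebank. The only rule taken from the description is that the derivation source has no word form (form is '_') and hangs on the node of the surface word.

```python
from typing import NamedTuple


class Token(NamedTuple):
    id: int
    form: str
    lemma: str
    cpos: str
    pos: str
    feats: str
    head: int
    deprel: str


def parse_conll(block: str) -> list[Token]:
    """Parse one sentence in (assumed) CoNLL-X layout: the first eight
    tab-separated columns are ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL."""
    tokens = []
    for line in block.strip().splitlines():
        c = line.split("\t")
        tokens.append(Token(int(c[0]), c[1], c[2], c[3], c[4], c[5], int(c[6]), c[7]))
    return tokens


def derivation_pairs(tokens: list[Token]) -> list[tuple[Token, Token]]:
    """Return (source, head) pairs where the source node has no word form
    (FORM is '_') and its head is another node of the same sentence."""
    by_id = {t.id: t for t in tokens}
    return [(t, by_id[t.head]) for t in tokens if t.form == "_" and t.head in by_id]


# Hypothetical sample rows, invented for illustration only.
SAMPLE = (
    "1\t_\toku\tVerb\tVerb\tPos\t2\tDERIV\t_\t_\n"
    "2\tokuma\t_\tNoun\tNInf\tA3sg|Pnon|Nom\t0\tROOT\t_\t_\n"
)

pairs = derivation_pairs(parse_conll(SAMPLE))
for source, head in pairs:
    print(f"{source.lemma}/{source.cpos} -> {head.form}/{head.cpos}")
```

Because the check looks only at the form column and not at the lemma, the same function would also return intermediate nodes of longer chains, should such nodes have neither form nor lemma.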
 +
 +Occasionally there are derivational chains longer than two nodes. An example is in sentence No. 82 of the test data:
 +lemma azal / Verb -> _ / Verb / Caus -> _ / Verb / Pass|Pos -> azaltılması / Noun / NInf / A3sg|P3sg|Nom
 +According to Google Translate, //azal// means “to decrease” and //azaltılması// means “reduced”. TRmorph gives the following four analyses:
 +<code>
 +analyze> azaltılması
 +azal<v><caus><pass><vn_ma><p3s>
 +azal<v><caus><pass><vn_ma><p3s><3s>
 +azal<v><caus><pass><vn_ma><p3s><3p>
 +azal<v><caus><pass><cv_ma><p3s>
 +</code>
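To compare the four TRmorph analyses quoted above, one can split each string into its stem and its sequence of angle-bracket tags and look for the tags they share. The following is a minimal sketch that works on the strings only; it does not call TRmorph, and the helper names ''split_analysis'' and ''common_prefix'' are made up for this example.

```python
import re

# The four analysis strings quoted above, copied verbatim.
ANALYSES = [
    "azal<v><caus><pass><vn_ma><p3s>",
    "azal<v><caus><pass><vn_ma><p3s><3s>",
    "azal<v><caus><pass><vn_ma><p3s><3p>",
    "azal<v><caus><pass><cv_ma><p3s>",
]


def split_analysis(analysis: str) -> tuple[str, list[str]]:
    """Split 'stem<tag1><tag2>...' into the stem and the list of tags."""
    stem = analysis.split("<", 1)[0]
    tags = re.findall(r"<([^<>]+)>", analysis)
    return stem, tags


def common_prefix(sequences: list[list[str]]) -> list[str]:
    """Return the longest run of leading items shared by all sequences."""
    prefix = []
    for items in zip(*sequences):
        if any(item != items[0] for item in items):
            break
        prefix.append(items[0])
    return prefix


parsed = [split_analysis(a) for a in ANALYSES]
stems = {stem for stem, _ in parsed}
shared = common_prefix([tags for _, tags in parsed])
print(stems, shared)  # {'azal'} ['v', 'caus', 'pass']
```

All four analyses agree on the stem //azal// and on the leading tags ''v'', ''caus'', ''pass'', which appears to match the Verb -> Caus -> Pass steps of the treebank chain; they differ only from the fourth tag onwards.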
  
 ==== Sample ====
