Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks:fi [2011/12/05 13:38]
zeman created
user:zeman:treebanks:fi [2011/12/05 14:27]
zeman Domain.
Line 20:
 
   * Website
-    * http://vvv.cs.ut.ee/~kaili/Korpus/puud/ ([[http://translate.google.cz/translate?sl=et&tl=en&js=n&prev=_t&hl=cs&ie=UTF-8&layout=2&eotf=1&u=http%3A%2F%2Fvvv.cs.ut.ee%2F~kaili%2FKorpus%2Fpuud%2F&act=url|Google translate]])
+    * http://bionlp.utu.fi/fintreebank.html
   * Data
     * //no separate citation//
   * Principal publications
-    * Kaili Müürisep, Tiina Puolakainen, Kadri Muischnek, Mare Koit, Tiit Roosmaa, Heli Uibo: [[https://nats-www.informatik.uni-hamburg.de/intern/proceedings/2003/RANLP/papers/p16.pdf|A New Language for Constraint Grammar: Estonian]]. In: International Conference Recent Advances in Natural Language Processing. Proceedings, pp. 304-310, Borovets, Bulgaria, 2003.
+    * Katri Haverinen, Filip Ginter, Veronika Laippala, Timo Viljanen, Tapio Salakoski: [[http://bionlp.utu.fi/sites/default/files/haverinen-et-al-2009.pdf|Dependency Annotation of Wikipedia: First Steps Towards a Finnish Treebank]]. In: Proceedings of The Eighth International Workshop on Treebanks and Linguistic Theories (TLT8), Milano, Italy, 2009.
+    * Katri Haverinen, Timo Viljanen, Veronika Laippala, Samuel Kohonen, Filip Ginter, Tapio Salakoski: [[http://dspace.utlib.ee/dspace/handle/10062/15936|Treebanking Finnish]]. In: Proceedings of The Ninth International Workshop on Treebanks and Linguistic Theories (TLT9), pp. 79-90. Tartu, Estonia, 2010.
   * Documentation
-    * [[http://beta.visl.sdu.dk/treebanks.html#The_source_format|File formats]]
-    * The header of the TIGER-XML version of the treebank contains lists of various sorts of tags with brief explanation.
+    * The file FILE-FORMAT.txt in the distribution
+    * [[http://www2.lingsoft.fi/doc/fintwol/intro/tags.html|Partial list of part-of-speech tags with descriptions]] (POS tagging has been done by www.lingsoft.fi)
  
 ==== Domain ====
 
-Mixed
-  * 388 tailored sentences with movement verbs
-  * 732 sentences with movement verbs from the Estonian FrameNet corpus
-  * 175 sentences from the Arborest corpus
-  * 20 sentences of spoken language
+Mixed (Wikipedia, Wikinews, university web-magazine and blogs).
 
 ==== Size ====
