
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:zeman:treebanks:de [2011/11/20 19:42]
zeman created
user:zeman:treebanks:de [2014/10/08 21:16]
zeman Fixed link.
Line 1: Line 1:
 ===== German (de) ===== ===== German (de) =====
  
-[[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/|TIGER Treebank]]
+[[http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger.html|TIGER Treebank]]
  
 ==== Versions ==== ==== Versions ====
Line 7: Line 7:
   * TIGER Treebank 1 (2003)   * TIGER Treebank 1 (2003)
   * TIGER Treebank 2 (2005)   * TIGER Treebank 2 (2005)
-  * TIGER Treebank 2.1 (2007) in [[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/doc/html/TigerXML.html|TIGER-XML]] or Negra export (text) format
+  * TIGER Treebank 2.1 (2007) in [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/TIGERSearch/doc/html/TigerXML.html|TIGER-XML]] or Negra export (text) format
   * CoNLL 2006   * CoNLL 2006
   * CoNLL 2009   * CoNLL 2009
Line 13: Line 13:
 ==== Obtaining and License ==== ==== Obtaining and License ====
  
-The TIGER Treebank is freely downloadable after you accept the [[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/license/htmllicense.shtml|license terms]] by pressing a button.
+The TIGER Treebank is freely downloadable after you accept the [[http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/htmlicense.html|license terms]] by pressing a button.
  
 Republication of the two CoNLL versions in LDC is planned but it has not happened yet. Republication of the two CoNLL versions in LDC is planned but it has not happened yet.
Line 31: Line 31:
  
   * Website   * Website
-    * http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERCorpus/
+    * http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger.html
   * Data   * Data
     * //no separate citation//     * //no separate citation//
Line 61: Line 61:
 It is not clear what the //semi-automatic// annotation means (probably first auto-tagging, then manual correction?) and whether it also applies to the morphosyntactic annotation. The CoNLL 2009 version also contains automatically disambiguated lemmas, tags and features. It is not clear what the //semi-automatic// annotation means (probably first auto-tagging, then manual correction?) and whether it also applies to the morphosyntactic annotation. The CoNLL 2009 version also contains automatically disambiguated lemmas, tags and features.
  
-The original treebank is phrase-based. The dependencies in the CoNLL versions must have thus been drawn using a head-selection procedure. Besides CoNLL data, the TIGER project also provides a subset of the TIGER Treebank in a dependency format.
+The original treebank is phrase-based. The dependencies in the CoNLL versions must have thus been drawn using a head-selection procedure. Besides CoNLL data, the TIGER project also provides a subset of the TIGER Treebank in a dependency format. (Note that it is possible in the TIGER-XML format to mark the head of each phrase using a particular edge label, e.g. ''HD''. However, it is not guaranteed that every phrase in the TIGER Treebank contains just one head constituent, see the sample below.)
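
The remark about head marking can be checked mechanically. The following is a minimal Python sketch, not part of the TIGER tooling, that counts ''HD'' edges per nonterminal in a TIGER-XML document and lists the phrases that do not have exactly one head. The embedded fragment is invented for illustration (it is not taken from the treebank), and the function names ''head_counts'' and ''phrases_without_unique_head'' are hypothetical; the sketch only assumes the usual TIGER-XML layout of ''nt'' elements containing ''edge'' children with a ''label'' attribute.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature TIGER-XML fragment, invented for illustration only.
# The coordination node (CNP) has conjunct and conjunction edges but no HD edge.
SAMPLE = """
<corpus>
  <body>
    <s id="s1">
      <graph root="s1_501">
        <terminals>
          <t id="s1_1" word="Hans" pos="NE"/>
          <t id="s1_2" word="und" pos="KON"/>
          <t id="s1_3" word="Maria" pos="NE"/>
          <t id="s1_4" word="schlafen" pos="VVFIN"/>
        </terminals>
        <nonterminals>
          <nt id="s1_500" cat="CNP">
            <edge label="CJ" idref="s1_1"/>
            <edge label="CD" idref="s1_2"/>
            <edge label="CJ" idref="s1_3"/>
          </nt>
          <nt id="s1_501" cat="S">
            <edge label="SB" idref="s1_500"/>
            <edge label="HD" idref="s1_4"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
  </body>
</corpus>
"""


def head_counts(xml_text, head_label="HD"):
    """Map each nonterminal id to the number of its edges labelled head_label."""
    root = ET.fromstring(xml_text)
    counts = {}
    for nt in root.iter("nt"):
        labels = Counter(edge.get("label") for edge in nt.findall("edge"))
        counts[nt.get("id")] = labels[head_label]  # Counter yields 0 if absent
    return counts


def phrases_without_unique_head(xml_text, head_label="HD"):
    """Return the sorted ids of nonterminals with zero or several head edges."""
    counts = head_counts(xml_text, head_label)
    return sorted(nt_id for nt_id, n in counts.items() if n != 1)
```

Any phrase reported by ''phrases_without_unique_head'' would need an extra rule in a head-selection procedure (for example, choosing the first conjunct of a coordination); which rule the CoNLL conversions actually used is not stated here.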
  
 ==== Sample ==== ==== Sample ====
