Institute of Formal and Applied Linguistics Wiki

Differences

This shows you the differences between two versions of the page.

courses:rg:2011-report-parser [2012/09/26 13:46]
ufal created
courses:rg:2011-report-parser [2012/09/27 10:39]
ufal
Line 3:
 written by Stephen Tratz and Eduard Hovy (Information Sciences Institute, University of Southern California)
  
-spoken by Martin Popel
+presented by Martin Popel
  
 reported by Michal Novák
  
 ===== Introduction =====
 +
 +The paper describes a high-quality conversion of the Penn Treebank to dependency trees. The authors introduce an improved labeled dependency scheme based on the Stanford scheme. In addition, they extend the non-directional easy-first algorithm of Goldberg and Elhadad to support non-projective trees by adding "move" actions inspired by Nivre's swap-based reordering for shift-reduce parsing. Their parser can also produce shallow semantic annotations for prepositions, possessives and noun compounds.
  
  
 ===== Notes =====
  
 +==== Dependency conversion structure ====
 +
 +  * in general, there are (at least) 3 possible types of dependency labels:
 +    * unlabeled - is it really a set of labels?
 +    * coarse labels of the CoNLL tasks
 +        * 10-20 labels
 +        * for example, NMOD is always under a noun, so assigning it is an easy task and the result is not very useful
 +    * their scheme is based on the Stanford dependency labels
 +
 +==== Conversion process ====
 +
 +  * converting the phrase-structure trees of the Penn Treebank to dependency trees
 +  * it consists of 3 steps:
 +      - add structure to flat NPs
 +      - constituent-to-dependency converter with some head-finding rule modifications
 +          * the list of rules in Figure 2 is hard to understand without reading the paper that their conversion method is based on
 +          * they reduced the number of generic "dep/DEP" relations
 +              * Stanford tags are hierarchical and "dep/DEP" is the top-most one
 +          * 1.3% of arcs are non-projective (out of 8.1% of all non-projective arcs) because of the following conversion (agreement can be a motivation for this, e.g. in Czech):
 +            {{:courses:rg:dependency-conversion.png|}}
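The constituent-to-dependency step listed above can be illustrated with a minimal sketch. It assumes a tiny, hypothetical head-rule table: the `HEAD_RULES` entries, the `Node` class and the example sentence are made up for illustration and are not the rules from Figure 2 of the paper. Each phrase picks one child as its head, and the lexical heads of the remaining children become dependents of the lexical head of that child.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    label: str                                  # phrase label or POS tag
    children: list = field(default_factory=list)
    word: Optional[str] = None                  # set only for leaves
    index: Optional[int] = None                 # 1-based token position, leaves only


# Hypothetical, heavily simplified head rules (NOT the paper's rules):
# phrase label -> (search direction, priority list of child labels).
HEAD_RULES = {
    "S": ("right", ["VP", "S"]),
    "VP": ("left", ["VBD", "VBZ", "VB", "VP"]),
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
    "PP": ("left", ["IN"]),
}


def head_child(node):
    """Pick the head child of a phrase according to HEAD_RULES."""
    direction, priorities = HEAD_RULES.get(node.label, ("left", []))
    kids = node.children if direction == "left" else node.children[::-1]
    for wanted in priorities:
        for kid in kids:
            if kid.label == wanted:
                return kid
    return kids[0]  # fallback: first child in the search direction


def convert(node, arcs):
    """Return the lexical head leaf of `node`; append (head, dependent) index pairs to `arcs`."""
    if node.word is not None:
        return node
    head_kid = head_child(node)
    head_leaf = convert(head_kid, arcs)
    for kid in node.children:
        if kid is not head_kid:
            dep_leaf = convert(kid, arcs)
            arcs.append((head_leaf.index, dep_leaf.index))
    return head_leaf


def leaf(tag, word, index):
    return Node(tag, word=word, index=index)


# "The dog chased a cat" as a small phrase-structure tree.
tree = Node("S", [
    Node("NP", [leaf("DT", "The", 1), leaf("NN", "dog", 2)]),
    Node("VP", [leaf("VBD", "chased", 3),
                Node("NP", [leaf("DT", "a", 4), leaf("NN", "cat", 5)])]),
])
arcs = []
root = convert(tree, arcs)
print(root.word, sorted(arcs))
```

The sketch produces unlabeled arcs only; assigning labels from a Stanford-style scheme would need an additional rule set that these notes do not describe.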
 +            
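The notion of a non-projective arc used in the statistic above can be made concrete with a short sketch. It assumes the usual definition (an arc is projective if every token strictly between the head and the dependent is a descendant of the head) and an artificial root at position 0. The function name and the example sentence are illustrative and not taken from the paper.

```python
def non_projective_arcs(heads):
    """Return the non-projective (head, dependent) arcs of a dependency tree.

    heads[i] is the head of token i + 1 (tokens are 1-based); 0 marks the
    artificial root. The input is assumed to be a well-formed tree (no cycles).
    """
    def is_descendant(token, ancestor):
        # Walk up from `token` until the root; succeed if `ancestor` is met.
        while token != 0:
            token = heads[token - 1]
            if token == ancestor:
                return True
        return False

    result = []
    for dep, head in enumerate(heads, start=1):
        lo, hi = sorted((head, dep))
        # The arc is non-projective if some token between head and dependent
        # is not dominated by the head.
        if any(not is_descendant(k, head) for k in range(lo + 1, hi)):
            result.append((head, dep))
    return result


# "A hearing is scheduled on the issue today"
#  1    2     3     4      5   6    7     8
example_heads = [2, 3, 0, 3, 2, 7, 5, 4]
bad = non_projective_arcs(example_heads)
print(bad, len(bad) / len(example_heads))
```

Dividing the number of returned arcs by the number of tokens gives the share of non-projective arcs in a sentence; summing both over a treebank gives a corpus-level figure of the kind quoted above.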
 +==== Parser ====
 +
 +  * we illustrated a step of the parser:
 +{{:courses:rg:ndef_parsing.png|}}
 +  * we compared the time complexity of this parser with that of other commonly used parsers
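A parsing step of the kind illustrated above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the action names `left_heads`, `right_heads` and `move` are hypothetical, the `score` function stands in for a trained model, and the rule that a pair may only be moved while it is still in its original word order is an assumption modeled on the restriction in Nivre's swap transition. At every step the parser scores all actions on all adjacent pairs of pending items and applies the single best one.

```python
def easy_first_parse(n_tokens, score):
    """Greedy easy-first parsing over tokens 1..n_tokens.

    score(action, left, right) -> float is a stand-in for a learned model.
    Returns a dict mapping each token to its head (0 = artificial root).
    """
    pending = list(range(1, n_tokens + 1))   # roots of partial trees, left to right
    heads = {}
    while len(pending) > 1:
        candidates = []
        for i in range(len(pending) - 1):
            left, right = pending[i], pending[i + 1]
            actions = ["left_heads", "right_heads"]
            if left < right:                  # assumed restriction: move only pairs in original order
                actions.append("move")
            for action in actions:
                candidates.append((score(action, left, right), action, i))
        _, action, i = max(candidates)        # the "easiest" decision goes first
        left, right = pending[i], pending[i + 1]
        if action == "left_heads":            # left item becomes the head of the right one
            heads[right] = left
            del pending[i + 1]
        elif action == "right_heads":         # right item becomes the head of the left one
            heads[left] = right
            del pending[i]
        else:                                 # move: swap the two adjacent items
            pending[i], pending[i + 1] = right, left
    heads[pending[0]] = 0
    return heads


# Toy score table for a 4-token sentence whose tree has a non-projective arc
# (token 1 depends on token 3, while token 2 depends on token 4).
TOY_SCORES = {
    ("move", 1, 2): 4.0,
    ("right_heads", 1, 3): 3.0,
    ("right_heads", 3, 4): 2.0,
    ("right_heads", 2, 4): 1.0,
}


def toy_score(action, left, right):
    return TOY_SCORES.get((action, left, right), 0.0)


parsed = easy_first_parse(4, toy_score)
print(parsed)
```

In the toy run, tokens 1 and 2 are first swapped so that token 1 becomes adjacent to its head, token 3; without the move action that arc could not be built from adjacent pairs alone. The sketch rescans every pair at each step for simplicity, so it says nothing about the time complexity of the actual parser.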