[ Skip to the content ]

Institute of Formal and Applied Linguistics Wiki


[ Back to the navigation ]

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
courses:rg:2011-report-parser [2012/09/27 10:36]
ufal
courses:rg:2011-report-parser [2012/09/27 11:22] (current)
ufal
Line 1: Line 1:
-====== Using a Wikipedia-based Semantic Relatedness Measure for Document Clustering ======+====== A Fast, Accurate, Non-Projective, Semantically-Enriched Parser ======
  
 written by Stephen Tratz and Eduard Hovy (Information Sciences Institute, University of Southern Carolina) written by Stephen Tratz and Eduard Hovy (Information Sciences Institute, University of Southern Carolina)
Line 32: Line 32:
           * they reduced the number of generic "dep/DEP" relation           * they reduced the number of generic "dep/DEP" relation
               * Stanford tags are hierarchical and "dep/DEP" is the top-most one               * Stanford tags are hierarchical and "dep/DEP" is the top-most one
-          * 1.3% of arcs are non-projective (out of 8.1% of all non-projective arcs) because of the following conversion (agreement can be a motivation for this, i.e. in Czech): +          * correcting of POS using the syntactic info + additional rules for specific word forms 
-            {{:courses:rg:dependency-conversion.png|}} +          * 1.3% of arcs are non-projective (out of 8.1% of all non-projective arcs) because of the following conversion (agreement can be a motivation for this, i.e. in Czech): {{:courses:rg:dependency-conversion.png|}} 
-            +      - additional changes and the final conversion from the intermediate output to a dependency structure 
 ==== Parser ==== ==== Parser ====
  
-  * we illustrated a step of the parser+  * we illustrated a step of the parser
 +{{:courses:rg:ndef_parsing.png|}} 
 +  * we compared time complexity of this system with other commonly used ones 
 + 
 +| MST parser | <latex>\mathop O(n^2)</latex> | | 
 +| MALT parser | <latex>\mathop O(n)</latex> | in fact slower | 
 +| this parser | <latex>\mathop O(n\log(n))</latex> | <latex>\mathop O(n^2)</latex> - naive implementation | 
 +| this parser - non-projective | <latex>\mathop O(n^2\log(n))</latex> | <latex>\mathop O(n^3)</latex> - naive implementation | 
 + 
 +    * implemented by a heap, it can reach <latex>\mathop O(n\log(n))</latex>; however, they don't use it because it's fast enough 
 +    * Algorithm1 
 +        * we weren't sure what the variable "index" is (best index of parent or any index of queue of unprocessed words) 
 +        * it again confirms that pseudocode is usually more confusing than a normal code or verbal explanation 
 +    * "move" operation: 
 +{{:courses:rg:move_operation.png|}} 
 + 
 +==== Features ==== 
 +  * Brown et al. clusters - they are surprisingly used rarely 
 +==== Features ==== 
 +  * not much discussed in the paper 
 +==== Evaluation ==== 
 +  * they make the same transformation as they did in Section 2 
 +==== Shallow semantic annotation ==== 
 +  * 4 optional modules 
 +      * preposition sense disambiguation 
 +      * noun compound interpretation 
 +      * possesives interpretation 
 +      * PropBank semantic role labelling 
 +          * (Hajič et al., 2009) is not in the list of references 
 +===== Conclusion ===== 
 +  * non-directional easy-first parsing 
 +  * new features - Brown et al. clusters 
 +  * fast and accurate 
 +  * modified Penn converter 
 +      * changes in 9.500 POS tags 
 +      * labels copula, coordination 
 +  * semantic info

[ Back to the navigation ] [ Back to the content ]