
Institute of Formal and Applied Linguistics Wiki




courses:rg:a-wordnet-based-system [2010/11/02 16:00] (current)
septina.larasati created
 +====== A WordNet-based system for multi-way classification of semantic relations ======
 +Matteo Negri and Milen Kouylekov
 +[[http://www.aclweb.org/anthology/S/S10/S10-1044.pdf|A WordNet-based system for multi-way classification of semantic relations]]
  
 +
 +===== Comments =====
 +
 +  * The paper briefly describes the task (SemEval-2010 Task 8). The detailed task description, including the annotation guidelines, the dataset configuration (training, development, and test sets), and the evaluation methodology, is given in a separate paper [1].
 +
 +  * The task was to classify the **semantic relation** between pairs of common nominals, which differs from labelling **semantic roles** [1].
 +
 +  * Some terms used in the paper are not explained there but are defined in [1], for example:
 +“For the purpose of annotation, we define a **nominal** as a **noun** or a **base noun phrase**. A base noun phrase is a noun and its pre-modifiers (e.g., nouns, adjectives, determiners). We do not include complex noun phrases (e.g., noun phrases with attached prepositional phrases or relative clauses).” [1]
 +
 +  * **Dataset configuration (distribution)** – “Data sets. The annotated data will be divided into a training set, a development set and a test set. There will be 1000 annotated examples for each of the ten relations: 700 for training, 100 for development and 200 for testing” [1]
 +
 +  * It’s unclear how the authors represent the features, since the paper only describes the feature types and gives no examples.
 +
 +  * There was some discussion on how the features of the **semantic boundary collocation** type are built.
 +The features are collected in a bottom-up fashion. Each feature is a collocation treated as a boolean feature. These collocations (provided in WordNet) are the ancestors of the annotated nominals (i.e., <e1> and <e2>) that appear in the training data at least n times and with at most m relations.
 +For a given sentence (an instance of the evaluated data), if an annotated nominal of the sentence is a hyponym of one of the features (the collected collocations), that feature is set to 1.
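The procedure discussed above can be sketched roughly as follows. This is only our reading of the description, not the authors' implementation: the toy hypernym table ''HYPERNYMS'', the function names, the training triples, and the choice to keep separate features for the <e1> and <e2> slots are all assumptions made for illustration. A real system would look up ancestors in WordNet instead of a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child -> parent); stands in for WordNet hypernyms.
HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "animal": "entity",
    "hammer": "tool",
    "knife": "tool",
    "tool": "entity",
}


def ancestors(word):
    """Return all ancestors of `word`, walking the hierarchy bottom-up."""
    result = []
    while word in HYPERNYMS:
        word = HYPERNYMS[word]
        result.append(word)
    return result


def collect_features(training, n, m):
    """Keep (slot, ancestor) pairs seen at least n times with at most m distinct relations."""
    counts = defaultdict(int)
    relations = defaultdict(set)
    for e1, e2, relation in training:
        for slot, nominal in (("e1", e1), ("e2", e2)):
            for anc in ancestors(nominal):
                counts[(slot, anc)] += 1
                relations[(slot, anc)].add(relation)
    return sorted(k for k in counts if counts[k] >= n and len(relations[k]) <= m)


def featurize(e1, e2, features):
    """Boolean vector: 1 if the nominal in that slot is a hyponym of the feature."""
    anc = {"e1": set(ancestors(e1)), "e2": set(ancestors(e2))}
    return [1 if a in anc[slot] else 0 for slot, a in features]


# Made-up training triples (e1, e2, relation), for illustration only.
TRAINING = [
    ("hammer", "dog", "Instrument-Agency"),
    ("knife", "cat", "Instrument-Agency"),
    ("dog", "hammer", "Other"),
]

features = collect_features(TRAINING, n=2, m=1)
print(features)
print(featurize("knife", "dog", features))
```

With n=2 and m=1, the very general ancestor "entity" is dropped because it co-occurs with two different relations, which seems to be the point of the "at most m relations" threshold: only ancestors that are both frequent and discriminative survive.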
 +
 +  * For readers who have not worked with the Weka toolkit, which the authors use, it is quite hard to understand how the Bayesian Network classifier works. An additional paper on Bayesian Network classifiers in Weka is listed as [2].
 +
 +  * The four runs are described and evaluated on the **training set**, not on the **test set**. The SemEval-2010 Task 8 submission used the first setting (**FBK_NK_RES1**), which achieved a macro-averaged F1 of 68.02% on the **test data**.
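For readers unfamiliar with the measure, macro-averaged F1 can be sketched as below: F1 is computed per relation and the per-relation scores are averaged with equal weight. This is a simplified illustration, not the official scorer; the function name ''macro_f1'' and the toy labels are assumptions, and the official evaluation described in [1] follows additional conventions that are not modelled here.

```python
def macro_f1(gold, predicted, labels):
    """Average the per-label F1 scores, giving every label equal weight."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Made-up gold and predicted labels, for illustration only.
gold = ["Cause-Effect", "Cause-Effect", "Component-Whole", "Component-Whole"]
pred = ["Cause-Effect", "Component-Whole", "Component-Whole", "Component-Whole"]
print(round(macro_f1(gold, pred, ["Cause-Effect", "Component-Whole"]), 4))
```

Because every relation counts equally, a system that does well only on the frequent relations is penalised more than it would be under accuracy or micro-averaging.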
 +
 +===== Suggested Additional Reading =====
 +  * [1] I. Hendrickx et al. 2010. SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals. Proceedings of the 5th SIGLEX Workshop on Semantic Evaluation.
 +  * [2] Remco R. Bouckaert. [[http://weka.sourceforge.net/manuals/weka.bn.pdf|Bayesian Network Classifiers in Weka]].
 +
 +
 +
 +
 +===== What do we like about the paper =====
 +  * It’s relatively easy to grasp the general idea of the techniques, although we had to refer to another paper for a more detailed technical description of the task.
 +
 +===== What do we dislike about the paper =====
 +  * Since it’s quite a short paper, there are not enough examples of the features.
 +
 +
 +Written by Septina Larasati
