Institute of Formal and Applied Linguistics Wiki


**Massimo Poesio, University of Essex**
//Empirical methods in the study of anaphora: lessons learned, remaining problems//
  
In the last ten years we have witnessed the creation of anaphorically annotated corpora [1] of substantial size (between 500,000 and 1 million tokens) for many languages, including Arabic, Catalan, Chinese, Czech, Dutch, English, German, Italian, Japanese, and Spanish. These resources have enabled a flourishing of evaluation initiatives devoted to the cross-lingual computational study of anaphora, such as SEMEVAL-2010, the CONLL 2011 shared task, and now the CONLL 2012 shared task (Arabic, Chinese and English). The results obtained in such campaigns indicate, however, that there is still a way to go before this task is understood to the degree of other aspects of natural language interpretation, such as semantic role labelling.

In this talk I will discuss the lessons learned during our experience with the annotation of the GNOME and ARRAU corpora of English, the LiveMemories corpus of Italian, and the ongoing annotation using the Phrase Detective game [2], as well as the issues that still remain to be tackled.
