12. 11. 2010 (MFF UK, Malostranske nam. 25, 4th floor, room S1)
SHACHAR MIRKIN (Bar-Ilan University, Israel)
INCORPORATING DISCOURSE INFORMATION WITHIN TEXTUAL ENTAILMENT INFERENCE

Abstract: Texts are commonly interpreted based on the entire discourse in which they are situated. Discourse processing has been shown to be useful for inference-based NLP applications; yet most systems for textual entailment, a generic paradigm for applied semantic inference, have addressed discourse considerations only via off-the-shelf coreference resolvers and in restricted ways. In this talk we explore various aspects of discourse information in entailment inference, suggest directions for addressing them, and investigate their impact on entailment performance.

In particular, we investigate the impact of discourse references, notably coreference and bridging, on textual entailment inference. On the basis of an in-depth analysis of entailment instances, we argue that beyond the standard nominal coreference substitution, other types of operations and other types of discourse relations should be considered in the inference process. We suggest a set of operations that can be incorporated into inference systems to enable the combination of information that is scattered among multiple locations in the text. Among these, the merge operation combines pieces of information from multiple coreference relations that are mutually needed for inference, and insertion accounts for bridging relations.

We identify several additional discourse aspects that affect entailment in a discourse-dependent setting, and suggest methods to address them in practice within the entailment process.
For instance, to partially address the limited coverage of coreference resolution tools, we extend the set of coreference relations to phrase pairs with a certain degree of lexical overlap, as long as no semantic incompatibility is found between them. To address coherence-related discourse phenomena, such as the tendency of entailing sentences to be adjacent to one another, we apply a two-phase classification scheme: each sentence is first classified on its own, and a second-phase meta-classifier then extracts discourse and document-level features from those per-sentence classifications.

Based on our experimental results, we suggest that even when simple solutions are employed, relying on discourse-based information is helpful and achieves a significant improvement in entailment recognition results, which stresses the importance of using such information in future entailment systems. Further, our findings point to challenges that reference resolution algorithms need to meet in order to become more useful for inference systems.

References:
Shachar Mirkin, Ido Dagan and Sebastian Padó. 2010. Assessing the Role of Discourse References in Entailment Inference. In Proceedings of ACL.