12. 11. 2010 (MFF UK, Malostranske nam. 25, 4. patro, mistnost S1)
SHACHAR MIRKIN (Bar-Ilan University, Israel)
INCORPORATING DISCOURSE INFORMATION WITHIN TEXTUAL ENTAILMENT INFERENCE
Abstract:
Texts are commonly interpreted based on the entire discourse in which they
are situated. Discourse processing has been shown to be useful for
inference-based NLP applications; yet, most systems for textual
entailment, a generic paradigm for applied semantic inference, have only
addressed discourse considerations via off-the-shelf coreference resolvers
and in restricted manners. In this talk we explore various aspects of
discourse information in entailment inference, suggest directions for
addressing them and investigate their impact on entailment performance.
In particular, we investigate the impact of discourse references, notably
coreference and bridging, on textual entailment inference. On the basis of
an in-depth analysis of entailment instances, we argue that beyond the
standard nominal coreference substitution, other types of operations and
other types of discourse relations should be considered in the inference
process. We suggest a set of operations that can be incorporated into
inference systems to enable the combination of information that is
scattered among multiple locations in the text. Among these, the merge
operation combines pieces of information from multiple coreference
relations mutually needed for inference, and insertion accounts for
bridging relations.
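As a rough illustration of how such operations might differ from plain
substitution, here is a toy sketch. It is not the speaker's
implementation: the Mention structure (a head word plus a set of
modifiers) and the function names substitute, merge and insert are
assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention representation: a head word plus modifiers."""
    head: str
    modifiers: frozenset = frozenset()


def substitute(mention, antecedent):
    """Standard coreference substitution: use the antecedent in place of the mention."""
    return antecedent


def merge(a, b):
    """Combine the modifiers of two coreferring mentions (keeps a's head; an assumption)."""
    return Mention(a.head, a.modifiers | b.modifiers)


def insert(mention, bridged):
    """Attach a bridged item (e.g. a location) to a mention as an extra modifier."""
    return Mention(mention.head, mention.modifiers | {bridged})


first = Mention("submarine", frozenset({"Russian"}))
second = Mention("submarine", frozenset({"sunken"}))
combined = merge(first, second)              # carries both "Russian" and "sunken"
located = insert(combined, "Barents Sea")    # adds the bridged location
```

In this toy setting, substitution alone would yield either "Russian" or
"sunken", whereas merge keeps both pieces of information available to
the inference step.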
We identify several additional discourse aspects that affect entailment in
a discourse-dependent setting, and suggest methods to practically address
them within the entailment process. For instance, to partially address the
limited coverage of coreference resolution tools, we extend the set of
coreference relations to phrase pairs with a certain degree of lexical
overlap, as long as no semantic incompatibility is found between them; to
address coherence-related discourse phenomena, such as the tendency of
entailing sentences to be adjacent to one another, we apply a two-phase
classification scheme, in which a second-phase meta-classifier is applied,
extracting discourse and document-level features based on the
classification of each sentence on its own.
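The two heuristics above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the system presented in the
talk: the function names, the 0.5 thresholds, the token-overlap measure,
the hand-listed incompatible word pairs and the fixed linear weights
(standing in for a trained meta-classifier) are all hypothetical.

```python
def lexically_coreferent(a, b, min_overlap=0.5, incompatible=frozenset()):
    """Treat two phrases as coreferent if they share enough tokens and no
    known-incompatible word pair occurs between them (assumed heuristic)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return False
    for x in ta:
        for y in tb:
            if (x, y) in incompatible or (y, x) in incompatible:
                return False
    # Overlap relative to the shorter phrase.
    return len(ta & tb) / min(len(ta), len(tb)) >= min_overlap


def meta_features(scores, i, threshold=0.5):
    """Second-phase features for sentence i, built from first-phase scores:
    its own score, the best adjacent score, and the document-level rate of
    sentences classified as entailing."""
    left = scores[i - 1] if i > 0 else 0.0
    right = scores[i + 1] if i + 1 < len(scores) else 0.0
    doc_rate = sum(s >= threshold for s in scores) / len(scores)
    return [scores[i], max(left, right), doc_rate]


def second_phase(scores, weights=(0.6, 0.3, 0.1), threshold=0.5):
    """Re-classify every sentence with a fixed linear rule; a real system
    would train the meta-classifier instead of hand-setting weights."""
    decisions = []
    for i in range(len(scores)):
        feats = meta_features(scores, i, threshold)
        combined = sum(w * f for w, f in zip(weights, feats))
        decisions.append(combined >= threshold)
    return decisions


first_phase_scores = [0.9, 0.45, 0.1, 0.1]   # per-sentence entailment scores
decisions = second_phase(first_phase_scores)
```

With these example scores, the second sentence falls just below the
threshold on its own but is accepted in the second phase because it is
adjacent to a confidently entailing sentence.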
Based on our experimental results, we suggest that even when simple
solutions are employed, relying on discourse-based information is helpful
and significantly improves entailment recognition results, which stresses
the importance of using such information in future entailment systems.
Further, our findings point to challenges that reference resolution
algorithms need to address in order to become more useful for inference
systems.
References:
Shachar Mirkin, Ido Dagan and Sebastian Padó. 2010. Assessing the Role of
Discourse References in Entailment Inference. In Proceedings of ACL.
<http://www.cs.biu.ac.il/~mirkins/publications/Mirkin-dagan-pado_ACL2010.pdf>