
MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility via semantic frames

Chi-kiu Lo and Dekai Wu
ACL 2011
http://www.aclweb.org/anthology/P11-1023

Presented by Petr Jankovský
Report by Rudolf Rosa

The paper was widely discussed throughout the whole session. This report is mainly chronological and roughly follows the sections of the paper.

1 Introduction

The paper proposes a semi-automatic translation evaluation metric that is claimed to correlate well with human judgement (especially in comparison to BLEU) while being much less labour-intensive, and thus much cheaper, than HTER.

MEANT assumes that a good translation is one where the reader correctly understands “Who did what to whom, when, where and why”. As Martin noted, this is a matter of adequacy rather than fluency, so a comparison with BLEU, which is more fluency-oriented, is not completely fair. Moreover, good systems nowadays make more errors in adequacy than in fluency, which makes BLEU an even worse metric these days.

Martin further explained that HTER is a metric where humans post-edit the MT output to turn it into a correct translation, and TER, which is essentially a word-based Levenshtein distance, is then computed between the MT output and the post-edited version as the score.
Matěj Korvas then pointed out an important difference between MEANT and HTER: MEANT uses reference translations, whereas HTER uses post-edits of the system output. Surprisingly, this is not noted in the paper.
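
To make the TER part concrete, here is a minimal Python sketch of a word-level Levenshtein distance normalised by the reference length. Note that full TER additionally allows block shifts of whole phrases, which this simplification omits, and that for HTER the "reference" would be the post-edited output (the function names are ours):

  def word_edit_distance(hyp_words, ref_words):
      # dp[i][j] = minimum number of insertions, deletions and
      # substitutions turning the first i hypothesis words
      # into the first j reference words
      dp = [[0] * (len(ref_words) + 1) for _ in range(len(hyp_words) + 1)]
      for i in range(len(hyp_words) + 1):
          dp[i][0] = i
      for j in range(len(ref_words) + 1):
          dp[0][j] = j
      for i in range(1, len(hyp_words) + 1):
          for j in range(1, len(ref_words) + 1):
              sub = 0 if hyp_words[i - 1] == ref_words[j - 1] else 1
              dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                             dp[i][j - 1] + 1,        # insertion
                             dp[i - 1][j - 1] + sub)  # substitution or match
      return dp[len(hyp_words)][len(ref_words)]

  def simplified_ter(hypothesis, reference):
      # edit distance normalised by reference length; lower is better
      hyp_words, ref_words = hypothesis.split(), reference.split()
      return word_edit_distance(hyp_words, ref_words) / len(ref_words)

For example, simplified_ter("the cat sat on mat", "the cat sat on the mat") yields 1/6, i.e. one insertion relative to a six-word reference.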

The group then discussed whether HMEANT evaluations are really faster than HTER annotations, as some of the readers had participated in an HMEANT evaluation. Some readers agreed that about 5 minutes per sentence is quite accurate, while others stated that 5 minutes is at best a lower bound. It should also be noted that the annotation actually consists of three parts: role labelling, frame alignment, and accuracy evaluation (full/partial/none); it is not completely clear whether all three parts are claimed to be done within the 5 minutes.
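
Just to illustrate how such judgements could be combined into a single score, below is a hypothetical Python sketch that aggregates per-frame counts of correct and partially correct role fillers into a precision/recall-style F-score; the weight w_partial and the plain F-measure are our simplifying assumptions here, not the paper's exact formula:

  def hmeant_style_f_score(frame_counts, n_hyp_fillers, n_ref_fillers,
                           w_partial=0.5):
      # frame_counts: one (n_correct, n_partial) pair of role-filler
      # judgements per pair of aligned semantic frames
      matched = sum(c + w_partial * p for c, p in frame_counts)
      precision = matched / n_hyp_fillers if n_hyp_fillers else 0.0
      recall = matched / n_ref_fillers if n_ref_fillers else 0.0
      if precision + recall == 0:
          return 0.0
      return 2 * precision * recall / (precision + recall)

For example, with two aligned frames judged (2 correct, 1 partial) and (1 correct, 0 partial), 5 role fillers on the MT side and 6 on the reference side, hmeant_style_f_score([(2, 1), (1, 0)], 5, 6) gives precision 3.5/5 = 0.7, recall 3.5/6 ≈ 0.58, and an F-score of about 0.64.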

Section 2 (Related work) was skipped.

3 MEANT: SRL for MT evaluation

