BLEU: a Method for Automatic Evaluation of Machine Translation

(courses:rg:2011, last revision 2011/12/06 by galuscakova)
written by Kishore Papineni, Salim Roukos, Todd Ward and Wei-Jing Zhu (IBM T. J. Watson Research Center)

spoken by Petr Jankovský

reported by Petra Galuščáková
Another problem that was discussed comes from section 2.2.2. For example, suppose we have three reference translations with lengths 12, 15 and 17 words, and our translation is 14 words long. Then, according to the article, our translation is penalized, because the closest reference length is 15, despite the fact that a shorter reference translation also exists. This seemed a bit suspicious.
> Yes. It's suspicious, but that is the official definition of BLEU. The kind-of official implementation [[ftp://
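To make the discussed behaviour concrete, here is a minimal Python sketch of BLEU's brevity penalty with the closest-reference-length rule from the paper. The function names are mine, and tie-breaking between two equally close reference lengths varies across implementations (the sketch breaks ties toward the shorter reference):

```python
import math

def closest_ref_length(candidate_len, ref_lens):
    # BLEU's "effective reference length": the reference length closest
    # to the candidate length; ties are broken here toward the shorter
    # reference, but implementations differ on this point.
    return min(ref_lens, key=lambda r: (abs(r - candidate_len), r))

def brevity_penalty(candidate_len, ref_lens):
    # BP = 1 if the candidate is at least as long as the effective
    # reference length, otherwise exp(1 - r/c).
    r = closest_ref_length(candidate_len, ref_lens)
    if candidate_len >= r:
        return 1.0
    return math.exp(1 - r / candidate_len)

# The example from the discussion: references of 12, 15 and 17 words
# and a 14-word candidate. The closest length is 15 (distance 1 beats
# distance 2 to the 12-word reference), so BP = exp(1 - 15/14) < 1,
# even though a shorter reference exists.
print(brevity_penalty(14, [12, 15, 17]))
```

So the 14-word candidate is penalized exactly as the article prescribes: only the single closest reference length matters, not the full range spanned by the references.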
The performed experiments show a high correlation between the manual and the automatic ranking of translation systems. BLEU is able to distinguish between good and bad translations, and between translations created by a human and by an automatic system.