Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:rg:2011:bleu-a-method-for-automatic-evaluation-of-machine-translation [2011/12/06 11:23]
galuscakova
courses:rg:2011:bleu-a-method-for-automatic-evaluation-of-machine-translation [2011/12/06 11:31] (current)
galuscakova
Line 13:
 ===== Notes =====
  
-The BLEU score is based on comparing the automatic (candidate) translation with human reference translations. Basically, the n-grams shared by the candidate and the reference translations are counted and divided by the number of all n-grams in the candidate. This n-gram precision is further modified: if a shared n-gram occurs more often in the candidate translation than in a reference translation, its count is replaced by the maximum count of this n-gram in any single reference translation. The BLEU score is then calculated as a linear average of these modified precisions. The brevity penalty is added to the sum to penalize translations that are shorter than the reference translations.
+The BLEU score is based on comparing the automatic (candidate) translation with human reference translations. Basically, the n-grams shared by the candidate and the reference translations are counted and divided by the number of all n-grams in the candidate. This n-gram precision is further modified: if a shared n-gram occurs more often in the candidate translation than in a reference translation, its count is replaced by the maximum count of this n-gram in any single reference translation. The BLEU score is then calculated as an arithmetic average of the logarithms of the modified precisions. The brevity penalty is added to the sum to penalize translations that are shorter than the reference translations.
-> No, it's not a "//linear average of these modified precisions//"; it's an "arithmetical average of **logarithms** of modified precisions", in other words a "**geometric** average of modified precisions". See Section 2.1.3.  --- Martin Popel
+
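
The clipping and averaging described in the note above can be sketched as follows. This is a minimal illustration, not the paper's reference implementation: the function names (`ngrams`, `modified_precision`, `geometric_mean`) are hypothetical, whitespace tokenization is assumed, and uniform weights are assumed for the average. The example sentences are the "the the the" case from Section 2.1 of the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of one candidate against several references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # Maximum count of each n-gram in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Clip each candidate count by the reference maximum.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def geometric_mean(precisions):
    """Arithmetic average of logarithms, exponentiated (uniform weights assumed)."""
    if any(p == 0 for p in precisions):
        return 0.0  # log(0) is undefined; the product would be zero anyway
    return math.exp(sum(math.log(p) for p in precisions) / len(precisions))


candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]
p1 = modified_precision(candidate, references, 1)  # 2 clipped matches out of 7
```

Averaging the logarithms and exponentiating is what makes the combined score a geometric rather than a linear mean, which is the point of the correction above.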
  
-Jindřich noticed a mistake in Section 2, where it is written that the phrase "of the party" is shared only with Reference 2, although it is also shared with Reference 3.
+Petr noticed a mistake in Section 2, where it is written that the phrase "of the party" is shared only with Reference 2, although it is also shared with Reference 3.
  
 Another problem that was discussed was found in Section 2.2.2. Suppose we have three reference translations with lengths of 12, 15 and 17 words, and our translation is 14 words long. Then, according to the article, our translation is penalized because the closest reference has length 15, despite the fact that a shorter reference translation also exists. This was a bit suspicious.
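
The length example above can be checked with a small sketch. It assumes the brevity penalty from the paper, BP = 1 if c > r and exp(1 - r/c) otherwise, applied to a single sentence, whereas the paper sums the best match lengths over the whole test corpus. The function names and the tie-breaking rule (prefer the shorter reference) are assumptions made for illustration only.

```python
import math


def closest_reference_length(candidate_len, reference_lens):
    """Pick the reference length closest to the candidate length.

    Ties are resolved toward the shorter reference (an assumption of this sketch).
    """
    return min(reference_lens, key=lambda r: (abs(r - candidate_len), r))


def brevity_penalty(candidate_len, reference_lens):
    """Sentence-level brevity penalty using the closest reference length."""
    if candidate_len == 0:
        return 0.0  # an empty candidate gets no credit
    r = closest_reference_length(candidate_len, reference_lens)
    if candidate_len > r:
        return 1.0
    return math.exp(1 - r / candidate_len)


# Lengths from the note: references of 12, 15 and 17 words, candidate of 14 words.
r = closest_reference_length(14, [12, 15, 17])
bp = brevity_penalty(14, [12, 15, 17])
```

Here the closest length is 15, so the penalty is exp(1 - 15/14), slightly below 1, even though the candidate is longer than the 12-word reference. This is the behaviour the group found suspicious.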