Statistical Significance Tests for Machine Translation Evaluation
Koehn, EMNLP 2004, link
Questions
1) BLEU_MT1 = 1, BLEU_MT2 = 0 (or undefined)
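BLEU_MT3 = 0.2 according to the formula in the paper, which is incorrect: 0.2 is just the product of the four n-gram precisions, 4/5 * 3/4 * 2/3 * 1/2.
It should be the geometric mean of the precisions: exp(1/4 * (ln(4/5) + ln(3/4) + ln(2/3) + ln(1/2))) = 0.2^(1/4) ≈ 0.669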
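The arithmetic above can be checked with a short Python sketch. It assumes the four n-gram precisions for MT3 are the ones appearing in the expression (4/5, 3/4, 2/3, 1/2) and that the brevity penalty is 1; the function name `geometric_mean_bleu` is made up for this illustration and is not taken from the paper.

```python
import math

def geometric_mean_bleu(precisions, brevity_penalty=1.0):
    """BLEU as brevity penalty times the geometric mean of n-gram precisions.

    Returns 0.0 if any precision is zero, since ln(0) is undefined.
    """
    if any(p == 0 for p in precisions):
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_avg)

# Assumed 1- to 4-gram precisions for MT3, read off the expression above.
precisions_mt3 = [4 / 5, 3 / 4, 2 / 3, 1 / 2]

product = math.prod(precisions_mt3)          # plain product, no 1/4 weight
bleu = geometric_mean_bleu(precisions_mt3)   # with the 1/4 weight

print(round(product, 3), round(bleu, 3))  # prints: 0.2 0.669
```

Returning 0.0 when a precision is zero is one common convention; it corresponds to the "0 (or undefined)" case noted for MT2.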