Differences
This shows you the differences between two versions of the page.
courses:rg:2012:sigtest-mt-zilka [2012/11/14 18:02] zilka |
courses:rg:2012:sigtest-mt-zilka [2012/11/18 19:45] popel |
||
---|---|---|---|
Line 34: | Line 34: | ||
====== Presentation ====== | ====== Presentation ====== | ||
* We answered: | * We answered: | ||
- | * Question 1 - BLEU scores are: 1 - 1.0, 2 - 0.0 (or some smoothed value), 3 - 0.2 | + | * Question 1 - BLEU scores are: 1 - 1.0, 2 - not defined (0.0 or some smoothed value in practice), 3 - 0.2 (based on the incorrect formula in the paper which is missing 1/4) |
* Question 2 - broad sampling, samples distributed far apart -> {data_1, data_101, data_201, ...} | * Question 2 - broad sampling, samples distributed far apart -> {data_1, data_101, data_201, ...} | ||
Line 43: | Line 43: | ||
* non-consecutive samples (spread far apart) - for each of the sets BLEU varies much less - ±1.5 % | * non-consecutive samples (spread far apart) - for each of the sets BLEU varies much less - ±1.5 % | ||
* they make an assumption and claim that there is no difference between comparing the output of 2 different MT systems and the output of 1 MT system that is just trained on different data | * they make an assumption and claim that there is no difference between comparing the output of 2 different MT systems and the output of 1 MT system that is just trained on different data | ||
- | * Lukas Zilka complained about this assumption - they should have conducted some experiments to support their claim, as there is nothing that suggest | + | * Lukas Zilka complained about this assumption - they should have conducted some experiments to support their claim, as there is nothing that suggests |
===== Section 4, 5 ===== | ===== Section 4, 5 ===== | ||
Line 56: | Line 56: | ||
===== Martin' | ===== Martin' | ||
- | * two philosophical views of p-value - Fisher' | + | * two philosophical views of p-value - Fisher' |
- | * we always | + | * we usually |
* p-value = | * p-value = | ||
* P(T(X)> | * P(T(X)> | ||
- | * unfortunately we tend to view the p-value as P(H0|x) which it is not and we need to apply the Bayes's theorem to get it | + | * unfortunately we tend to view the p-value as P(H0|x) which it is not and we need to apply the Bayes' theorem to get it |
- | * bootstrap resampling can be viewed as p-value=P(d(x) > d(x_orig)|H0), | + | * bootstrap resampling can be viewed as p-value |