====== Encouraging Consistent Translation ======
The list of discussed topics follows the outline of the paper:
==== Sec. 2. Related Work ====
**Differences from Carpuat 2009**
  * The approach is different: the decoder just gets additional features and the final decision is left to it -- Carpuat 2009 post-edits the outputs and substitutes the most likely variant everywhere
  * The authors do not state their evidence clearly.
  * One sense is not the same as one translation

==== Sec. 3. Exploratory Analysis ====
**Hiero**
  * The idea would most probably work the same in normal phrase-based SMT, but the authors use hierarchical phrase-based translation (Hiero)
==== Sec. 4. Approach ====
The actual experiments begin only here; the data used differs from that of the exploratory analysis.
  * But the rules are very similar, so we also need something less fine-grained
  * C2 is a target-side feature; it just counts the target-side tokens (only the "most important" ones)
    * It may be compared to Language Model features, but it is trained only on the target part of the bilingual data
  * C3 counts occurrences of source-target token pairs (and also uses only the "most important" ones)
  * They need two passes through the data (see the sketch below)
  * You need to have document segmentation
  * The frequencies are trained on the tuning data
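A minimal sketch of how the C2- and C3-style counts could be gathered from a first decoding pass over one document; the function name and data layout are assumptions for illustration, not the authors' implementation:

<code python>
from collections import Counter

def collect_counts(first_pass_doc):
    """Collect per-document counts from a first decoding pass.

    ``first_pass_doc`` is a list of (source_tokens, target_tokens,
    alignment) triples for one document; ``alignment`` is a list of
    (source_pos, target_pos) links.  Purely illustrative.
    """
    target_counts = Counter()  # C2-style: target-token frequencies
    pair_counts = Counter()    # C3-style: aligned source-target pairs
    for src, tgt, alignment in first_pass_doc:
        # The paper restricts counting to the "most important" tokens;
        # here every token is counted for simplicity.
        target_counts.update(tgt)
        for i, j in alignment:
            pair_counts[(src[i], tgt[j])] += 1
    return target_counts, pair_counts

# In the second pass, a rule producing target token t in the same document
# can fire a feature derived from target_counts[t] (C2-style), or from
# pair_counts[(s, t)] for an aligned source token s (C3-style).
</code>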
+ | |||
+ | ==== Sec. 5. Evaluation and Discussion ==== | ||
+ | **Choice of baseline** | ||
  * Baselines are quite nice and competitive
  * MIRA, used for tuning, is very cutting-edge

**Tuning the feature weights**
  * For the 1st phase, the feature weights are not tuned separately
  * This is in order to speed up the experiments: they don't want to wait for MIRA twice

**Different evaluation metrics**
  * The BLEU variants do not differ that much, only in the Brevity Penalty for multiple references
    * IBM BLEU uses the reference that is closest to the MT output (in terms of length), while NIST BLEU uses the shortest one (see the sketch below)
  * This was probably just due to technical reasons, e.g. they had their optimization software designed for one metric and not the other
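To make the Brevity Penalty difference concrete, a per-segment sketch of the two reference-length conventions; the function name and tie-breaking rule are assumptions, and real BLEU accumulates the lengths over the whole corpus before applying BP = min(1, exp(1 - r/c)):

<code python>
import math

def brevity_penalty(cand_len, ref_lens, variant="ibm"):
    """Brevity penalty for one segment with multiple references.

    IBM BLEU takes the reference length closest to the candidate
    (ties broken towards the shorter reference here); NIST BLEU
    takes the shortest reference.  Illustrative only.
    """
    if variant == "ibm":
        r = min(ref_lens, key=lambda n: (abs(n - cand_len), n))
    else:  # NIST-style
        r = min(ref_lens)
    return 1.0 if cand_len >= r else math.exp(1.0 - r / cand_len)

# Example where the two differ: a 14-token output with references of
# length 10 and 15.  IBM picks r=15 (closest), so BP = exp(1 - 15/14)
# ~ 0.93; NIST picks r=10 (shortest), so BP = 1.0.
</code>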