======Encouraging Consistent Translation Choices======
====Related Work:====
  - A paper on a similar approach by Carpuat (2009) was found to differ from this work; they used the "one translation per discourse" hypothesis.
  - Without giving any proper evidence, the authors speculate that modeling "one translation per discourse" improves translation quality.
====Analysis:====
  - Forced decoding is the decoding method in which, for a given pair of source and target sentences, the decoder searches for the translation rules that fit the target sentence for the given source sentence (see the sketch after this list).
  - The term "..."
  - After selecting sample cases, a few filtering techniques are applied to discard the irrelevant samples. The filtering steps are well documented in the paper (page 419, first paragraph of the second column).
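To illustrate forced decoding from the first item above, here is a minimal sketch; it is not cdec's actual implementation, and the toy rule table and sentences are invented for the example:

<code python>
# Minimal sketch of forced decoding: keep only the translation rules
# that can reproduce the given reference translation. Substring matching
# on joined tokens is a crude stand-in for real rule matching.

def forced_decoding_rules(rules, source, reference):
    src_text = " ".join(source)
    ref_text = " ".join(reference)
    kept = []
    for src_phrase, tgt_phrase in rules:
        if src_phrase in src_text and tgt_phrase in ref_text:
            kept.append((src_phrase, tgt_phrase))
    return kept

rules = [("haus", "house"), ("haus", "home"), ("das", "the")]
print(forced_decoding_rules(rules, ["das", "haus"], ["the", "house"]))
# [('haus', 'house'), ('das', 'the')] -- "haus -> home" cannot fit the reference
</code>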
====Approach:====
  - The core idea of maintaining translation consistency (TC) is implemented by introducing consistency features, computed from the first-pass output, into a second decoding pass.
  - BM25, which is used as the term weighting function, is a well-known ranking function in the field of information retrieval and a refined version of TF-IDF (another well-known ranking function); see the BM25 sketch after this list.
  - Description of the consistency features (a toy version of C<sub>1</sub> is also sketched below):
    - C<sub>1</sub> is a fine-grained term weighting approach and is computed by counting how many times a rule was applied in the first pass. This approach suffers when the source and target phrases differ only in non-terminal positioning or in the use of determiners.
    - C<sub>2</sub>, on the other hand, is a coarse-grained function which takes only target tokens into account. To us, C<sub>2</sub> looks similar to a language model feature trained only on the target side of the dev set.
    - C<sub>3</sub> goes over all alignment pairs and for each rule it selects the aligned source-target word pairs, measuring consistency at the word level.
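For the BM25 item above, here is a minimal sketch of the standard Okapi BM25 term weight from information retrieval; the parameters k1 and b are the usual defaults, and nothing here is taken from the paper's implementation:

<code python>
import math

def bm25_weight(tf, df, num_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """BM25 score of one term in one document.
    tf: term frequency in the document, df: document frequency,
    num_docs: collection size, doc_len/avg_doc_len: length normalization."""
    # IDF smoothed (+1 inside the log) so it stays non-negative
    idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
    # Saturating term-frequency component with document-length normalization
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

# A rare term (low df) gets a much higher weight than a common one:
print(bm25_weight(tf=3, df=5, num_docs=1000, doc_len=100, avg_doc_len=120))
print(bm25_weight(tf=3, df=500, num_docs=1000, doc_len=100, avg_doc_len=120))
</code>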
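And a toy version of the C<sub>1</sub>-style counting: tally how many times each rule fired in the first pass over a document, then expose that count as a feature in the second pass. The data structures and function names are invented for the example:

<code python>
from collections import Counter

def first_pass_rule_counts(first_pass_derivations):
    """first_pass_derivations: per-sentence lists of applied rules (src, tgt)."""
    counts = Counter()
    for sentence_rules in first_pass_derivations:
        counts.update(sentence_rules)
    return counts

def c1_feature(rule, counts):
    # Higher value = the rule was used more often in this document's
    # first pass, so the second pass is encouraged to reuse it.
    return counts[rule]

doc = [[("haus", "house"), ("das", "the")],
       [("haus", "house")],
       [("haus", "home")]]
counts = first_pass_rule_counts(doc)
print(c1_feature(("haus", "house"), counts))  # 2
print(c1_feature(("haus", "home"), counts))   # 1
</code>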
====Evaluation:====
  - Cdec's implementation of hierarchical MT is used in this work. As we know, hierarchical decoding is also implemented in other MT systems such as Moses, Joshua, etc. The selection of cdec over the other MT systems is the authors' own choice and is not justified in the paper.
  - MIRA is used for tuning the feature weights.
  - The authors don't tune the decoder in the first pass, i.e. they don't optimize the feature weights before extracting the consistency features.
  - NIST-BLEU is used to compare results with the official NIST evaluation, whereas IBM-BLEU is used for evaluating the rest of the experiments. We don't fully understand the use of two different BLEU variants (preferring shorter sentences in case of NIST and longer ones in case of IBM) instead of sticking with NIST-BLEU throughout; the sketch after this list shows where the two variants differ.
  - They gain a maximum increase of 1.0 BLEU point after combining all three features.
  - The authors call BLEU a "..." metric:
    - They could have supported their argument by manually evaluating the test set.
    - Instead of spending half a page criticizing BLEU, they could have evaluated their system on another metric such as METEOR.
  - We believe that significance testing should have been performed on these improvements (a standard recipe is sketched after this list).
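Regarding the two BLEU variants: as far as we know, they differ mainly in the effective reference length used by the brevity penalty (shortest reference for NIST, closest-in-length reference for IBM). A sketch of just that difference; tie-breaking between equally close references is implementation-specific:

<code python>
import math

def brevity_penalty(hyp_len, ref_lens, variant="nist"):
    if variant == "nist":
        ref_len = min(ref_lens)  # shortest reference length
    else:  # "ibm": reference closest in length to the hypothesis
        ref_len = min(ref_lens, key=lambda r: abs(r - hyp_len))
    return 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)

# Same 10-token hypothesis, same references, different penalties:
print(brevity_penalty(10, [8, 11], variant="nist"))  # 1.0 (shortest ref is 8)
print(brevity_penalty(10, [8, 11], variant="ibm"))   # ~0.905 (closest ref is 11)
</code>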
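For the significance testing, paired bootstrap resampling (Koehn, 2004) would be the standard choice. A minimal sketch, where bleu stands for any corpus-level BLEU function supplied by the caller:

<code python>
import random

def paired_bootstrap(sys_a, sys_b, refs, bleu, trials=1000):
    """Fraction of resampled test sets on which system B outscores
    system A; values near 1.0 mean B's gain is unlikely to be chance."""
    n = len(refs)
    wins_b = 0
    for _ in range(trials):
        idx = [random.randrange(n) for _ in range(n)]  # sample with replacement
        sample_a = [sys_a[i] for i in idx]
        sample_b = [sys_b[i] for i in idx]
        sample_r = [refs[i] for i in idx]
        if bleu(sample_b, sample_r) > bleu(sample_a, sample_r):
            wins_b += 1
    return wins_b / trials
</code>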
====Conclusion:====
The paper is nicely written and all experiments are well documented. We believe that the consistent-translation-choices system is well suited only for translating from a morphologically rich language to a morphologically poor one, and not the other way round. When translating into a morphologically rich language, this approach can make serious errors by putting different morphological forms of a word, bearing different meanings, under one consistent translation.