courses:rg:2012:encouraging-consistent-translation-bushra [2012/10/23 15:37] jawaid
====Approach:====
  - The core idea of maintaining translation consistency (TC) is implemented by introducing a bias towards TC in the form of "consistency features". Three consistency features are used inside the decoding model, and their values are estimated using a 2-pass decoding scheme.
  - BM25, which is used as the term weighting function, is a well-known ranking function in the field of information retrieval and a refined version of TF-IDF (another ranking function used in IR).
  - Description of the consistency features:
    - C<sub>1</sub> is a fine-grained term weighting function, computed by counting how many times a rule was applied in the first pass. This approach suffers when the source and target phrases differ only in non-terminal positioning or in the use of determiners.
    - C<sub>2</sub>, on the other hand, is a coarse-grained function which takes only target tokens into account. To us, C<sub>2</sub> looks similar to a language model feature trained only on the target side of the dev set.
    - C<sub>3</sub> goes over all alignment pairs and, for each rule, selects the term pairs that have the maximum feature value.
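Since BM25 is central to the term weighting above, here is a minimal sketch of the standard BM25 weight of a single term in a document. The function name and the default hyperparameters `k1`/`b` are our assumptions, not taken from the paper:

```python
import math

def bm25_weight(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Standard BM25 weight of one term in one document (illustrative sketch).

    tf          -- term frequency in the document
    doc_len     -- document length; avg_doc_len is the corpus average
    n_docs      -- number of documents in the collection
    doc_freq    -- number of documents containing the term
    k1, b       -- usual BM25 hyperparameters (assumed defaults)
    """
    # Rarer terms get a higher inverse-document-frequency weight.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    # Length normalization: long documents are penalized via b.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + norm)
```

Unlike raw TF-IDF, the term-frequency component saturates as `tf` grows, which is the main refinement BM25 brings.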
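The fine-grained feature C<sub>1</sub> above boils down to counting first-pass rule applications and feeding the (log-scaled) counts back into the second pass. A minimal sketch, assuming a hypothetical representation of derivations as lists of rule ids (the actual decoder internals differ):

```python
import math
from collections import Counter

def rule_counts(first_pass_derivations):
    """Count how many times each rule fired in the first decoding pass.

    first_pass_derivations: list of derivations, each a list of rule ids
    (hypothetical representation for illustration only).
    """
    counts = Counter()
    for derivation in first_pass_derivations:
        counts.update(derivation)
    return counts

def c1_feature(rule_id, counts):
    # Fine-grained consistency feature: log-scaled count of the rule's
    # first-pass applications; zero for rules never seen in pass one.
    return math.log(1 + counts.get(rule_id, 0))
```

Rules that were used consistently across the first-pass output receive a larger feature value, biasing the second pass towards reusing them.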
====Evaluation:====