
Institute of Formal and Applied Linguistics Wiki



courses:rg:2012:encouraging-consistent-translation [2012/10/16 15:15] dusek
      * Beware, this notion of grouping is not well-defined and does not create equivalence classes: "old hostages" = "new hostages" = "completely new hostages", but "old hostages" != "completely new hostages" (we hope this didn't actually happen)
    * Cases where //only one translation variant prevails// are //discarded// (this is the case of "Korea")
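The non-transitivity noted above is easy to reproduce. A minimal sketch, assuming (hypothetically -- the paper's actual grouping criterion may differ) that two phrases are grouped whenever their word-level edit distance is at most 1:

```python
def word_edit_distance(a, b):
    """Levenshtein distance over word sequences."""
    a, b = a.split(), b.split()
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (wa != wb)))    # substitution
        prev = cur
    return prev[-1]

def grouped(a, b):
    # hypothetical criterion: phrases differ by at most one word edit
    return word_edit_distance(a, b) <= 1

# "grouped" is not transitive, so it yields no equivalence classes:
print(grouped("old hostages", "new hostages"))             # True
print(grouped("new hostages", "completely new hostages"))  # True
print(grouped("old hostages", "completely new hostages"))  # False
```

Any pairwise similarity threshold behaves this way; proper equivalence classes would require something like transitive closure over the pairs, which merges all three phrases into one group.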

==== Sec. 4. Approach ====

The actual experiments begin only at this point; the data used differs from that of the preceding sections.

**Choice of features**
  * They define 3 features that are designed to be biased towards consistency -- or are they?
    * If e.g. two translation variants are used 2 times each, they will get roughly the same score
  * The new features require two passes through the data
  * The BM25 function is a refined version of a [[http://en.wikipedia.org/wiki/TF-IDF|TF-IDF]] score
  * The exact parameter values are probably not tuned but left at their defaults (and maybe they don't have much influence anyway)
    * See NPFL103 for details on Information Retrieval; it's largely black magic

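To make the TF-IDF connection concrete, here is a minimal sketch of the standard BM25 term score with its common default parameters (''k1 = 1.2'', ''b = 0.75''); the variable names are illustrative and not taken from the paper:

```python
import math

def bm25_term_score(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """BM25 score of one term in one document.

    tf: term frequency in the document, df: number of documents
    containing the term, n_docs: collection size.
    """
    # IDF part -- plays the same role as in TF-IDF
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    # TF part: saturates as tf grows (unlike raw TF-IDF)
    # and is normalized by document length
    tf_part = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * tf_part

# Repeating a term gives diminishing returns (saturation):
s1 = bm25_term_score(tf=1, df=10, n_docs=1000, doc_len=100, avg_doc_len=120)
s5 = bm25_term_score(tf=5, df=10, n_docs=1000, doc_len=100, avg_doc_len=120)
assert 0 < s1 < s5 < 5 * s1
```

The refinement over plain TF-IDF is exactly this saturation and length normalization, controlled by ''k1'' and ''b'' -- the parameters that are presumably left at their defaults here.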
**Feature weights**
  * The usual model in MT scores the hypotheses according to the feature values (''f'') and their weights (''lambda''):
    * ''score(H) = exp( sum( lambda_i * f_i(H) ) )''
  * The feature weights are trained on a held-out data set using [[http://acl.ldc.upenn.edu/acl2003/main/pdfs/Och.pdf|MERT]] (or, here, [[http://en.wikipedia.org/wiki/Margin_Infused_Relaxed_Algorithm|MIRA]])
  * The resulting weights are not mentioned -- but if a weight comes out < 0, will this actually favor //different// translation choices?

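The effect of a negative weight can be checked directly on the log-linear formula above. A toy sketch with hypothetical feature values (not from the paper): with ''lambda < 0'' on a consistency feature, the hypothesis with the //lower// consistency score wins:

```python
import math

def score(features, weights):
    # log-linear model: score(H) = exp(sum(lambda_i * f_i(H)))
    return math.exp(sum(l * f for l, f in zip(weights, features)))

# two hypotheses: (LM feature, consistency feature) -- toy values
h_consistent   = (0.5, 3.0)
h_inconsistent = (0.5, 1.0)

weights_pos = (1.0,  0.4)   # positive weight rewards consistency
weights_neg = (1.0, -0.4)   # negative weight would penalize it

print(score(h_consistent, weights_pos) > score(h_inconsistent, weights_pos))  # True
print(score(h_consistent, weights_neg) < score(h_inconsistent, weights_neg))  # True
```

So a negative learned weight would indeed push the decoder towards less consistent outputs -- which is why reporting the tuned weights would have been informative.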
**Differences in features**
  * C1 indicates that a certain Hiero rule was used frequently
    * But the rules are very similar to one another, so we also need something less fine-grained
  * C2 is a target-side feature; it just counts the target-side tokens
    * It may be compared to Language Model features, but it is trained only on the target side of the bilingual training data

