Differences

This shows you the differences between two versions of the page.

--- courses:rg:2012:encouraging-consistent-translation [2012/10/16 15:07] dusek
+++ courses:rg:2012:encouraging-consistent-translation [2012/10/16 15:15] dusek
@@ Line 24: / Line 24: @@
   * The authors chose the ''cdec'' implementation of Hiero (which is implemented in several systems: Moses, cdec, Joshua etc.)
     * The choice was probably arbitrary; other systems would likely yield similar results
 **Forced decoding**
   * This means that the decoder is given both the source //and// the target sentence and has to provide the rules/phrases that map the source to the target
@@ Line 30: / Line 31: @@
     * Forced decoding is much more informative for Hiero translations than for "plain" phrase-based ones, since many different parse trees can yield the same target string, but not as many different phrase segmentations
+**The choice and filtering of "cases"**
+  * The "cases" in Table 1 are selected according to the //possibility// of different translations, i.e. each case has at least two translations of the source seen in the training data. The translation counts, however, come from the test data, so it is OK that e.g. "Korea" translates as "Korea" all the time
+  * Table 1 is unfiltered -- only some of the "cases" are then considered relevant:
+    * Cases that are //too similar// (fewer than half of the characters differ) are //joined together//
+      * Beware: this notion of grouping is not well-defined. Similarity is not transitive, so it does not create equivalence classes: "old hostages" = "new hostages" = "completely new hostages" but "old hostages" != "completely new hostages" (we hope this didn't actually happen)
+    * Cases where //only one translation variant prevails// are //discarded// (this is the case of "Korea")
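
The non-transitivity warning above can be checked with a minimal sketch. It //assumes// that "fewer than half of the characters differ" means a Levenshtein edit distance smaller than half the length of the longer string; the notes do not say how the authors actually measured the difference, and the names ''edit_distance'' and ''too_similar'' are hypothetical helpers, not anything from the paper or from cdec.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match for free
            ))
        prev = curr
    return prev[-1]


def too_similar(a: str, b: str) -> bool:
    """ASSUMED criterion: fewer than half of the characters differ."""
    return edit_distance(a, b) < 0.5 * max(len(a), len(b))


old = "old hostages"
new = "new hostages"
completely_new = "completely new hostages"

# Compare the three pairs from the example in the notes.
for x, y in [(old, new), (new, completely_new), (old, completely_new)]:
    print(f"{x!r} vs {y!r}: too similar = {too_similar(x, y)}")
```

Under this assumed measure the first two pairs are joined while the third is not, so "joining" similar cases gives different groups depending on the order in which the pairs are compared.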

Institute of Formal and Applied Linguistics Wiki
