
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

courses:rg:2012:encouraging-consistent-translation [2012/10/17 11:44]
dusek
courses:rg:2012:encouraging-consistent-translation [2012/10/17 11:45]
dusek
Line 57:
     * but rules are very similar, so we also need something less fine-grained
   * C2 is a target-side feature, just counts the target side tokens (only the "most important" ones; in terms of TF-IDF)
-    * It may be compared to Language Model features, but is trained only on the target part of the bilingual training data.
+    * It may be compared to Language Model features, but is trained only on the target part of the bilingual tuning data.
   * C3 counts occurrences of source-target token pairs (and uses the "most important" term pair for each rule, again)
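
The C2 bullet above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`idf_table`, `most_important_token`, `c2_count`), the plain log-ratio IDF, and the use of the raw count as the feature value are all assumptions made for illustration. The idea shown is only that each rule is represented by its highest-scoring target token in terms of TF-IDF, and that token's occurrences are counted in a first-pass translation of the document.

```python
import math
from collections import Counter


def idf_table(documents):
    """Inverse document frequency over tokenized documents (assumed: plain log ratio)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each token once per document
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def most_important_token(rule_target_tokens, doc_counts, idf):
    """Pick the rule's target token with the highest TF-IDF in this document."""
    # Tokens unseen in the IDF collection get weight 0.0 (an assumption).
    return max(rule_target_tokens, key=lambda t: doc_counts[t] * idf.get(t, 0.0))


def c2_count(rule_target_tokens, first_pass_tokens, idf):
    """Hypothetical C2-style count: occurrences of the rule's key target token
    in the first-pass translation of the current document."""
    doc_counts = Counter(first_pass_tokens)
    best = most_important_token(rule_target_tokens, doc_counts, idf)
    return doc_counts[best]


# Toy collection used to estimate IDF (made-up data).
collection = [
    ["the", "bank", "of", "the", "river"],
    ["the", "bank", "loan"],
    ["the", "shore"],
]
idf = idf_table(collection)

# First-pass translation of the document being translated (made-up data).
first_pass = ["the", "bank", "approved", "the", "bank", "loan"]
print(c2_count(["the", "bank"], first_pass, idf))
```

Here "the" occurs in every document of the toy collection, so its IDF is zero and "bank" is chosen as the rule's key token. A C3-style variant would count (source token, target token) pairs in the same way.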
  
Line 63:
   * They need two passes through the data
   * You need to have document segmentation
-    * Since the frequencies are trained on the training set, you can just translate one document at a time, no need to have full sets of documents
+    * Since the frequencies are trained on the tuning set (see Sec. 5), you can just translate one document at a time, no need to have full sets of documents
  
 ==== Sec. 5. Evaluation and Discussion ====
