courses:rg:2011:deciphering_foreign_language (created 2011/12/06, last modified 2012/01/08 by tran)

Scriber: Ke. T
+ | |||
+ | The talk is about how to tackle MT without parallel training data. | ||
+ | |||
==== Section 1 ====
Given sentence pairs (e,f), where e is an English sentence and f is a foreign sentence, the translation model estimates the parameter θ by maximum likelihood:
<latex>
\mathop {\arg \max }\limits_\theta \prod_{(e,f)} P_\theta (f|e)
</latex>
+ | |||
+ | In case we do not have parallel data, we observe foreign text and try to maximize likelihood | ||
+ | < | ||
+ | \mathop {\arg \max }\limits_\theta | ||
+ | </ | ||
+ | |||
+ | Treating English translation as hidden alignment, our task is to find the parameter < | ||
+ | < | ||
+ | \mathop {\arg \max }\limits_\theta | ||
+ | </ | ||
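
As a sanity check, the hidden-translation objective above can be evaluated by brute force on a toy example: marginalize over all candidate English sequences e for one foreign sentence f. The vocabularies and probability tables below are invented for illustration, not taken from the paper.

```python
import itertools

# Toy illustration of P_theta(f) = sum_e P(e) * P_theta(f|e).
# All vocabularies and probabilities are made-up toy values.
english_vocab = ["hello", "world"]

def p_e(e):
    """Toy uniform English language model over fixed-length sequences."""
    return (1.0 / len(english_vocab)) ** len(e)

def p_f_given_e(f, e, table):
    """Word-for-word channel model: P(f|e) = prod_i table[e_i][f_i]."""
    prob = 1.0
    for fi, ei in zip(f, e):
        prob *= table[ei].get(fi, 0.0)
    return prob

# Toy translation table theta: theta[e_word][f_word] = P(f_word | e_word).
theta = {
    "hello": {"hola": 0.9, "mundo": 0.1},
    "world": {"hola": 0.2, "mundo": 0.8},
}

f = ["hola", "mundo"]
# Marginalize over all English sequences of the same length (hidden e).
likelihood = sum(
    p_e(e) * p_f_given_e(f, e, theta)
    for e in itertools.product(english_vocab, repeat=len(f))
)
print(likelihood)
```

This brute force is exponential in sentence length; it only serves to make the sum over hidden e concrete.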
+ | |||
+ | ==== Section 2 ==== | ||
+ | Section 2 deals with a simple version of translation, | ||
+ | |||
+ | The solution for this problem is pretty simple: Given a sequence of English tokens < | ||
+ | < | ||
+ | \mathop {\arg \max }\limits_\theta | ||
+ | </ | ||
+ | |||
+ | The key idea of section 2 is the Iterative EM algorithm, which is used to estimate < | ||
+ | |||
+ | If we use traditional EM, every time we update < | ||
+ | |||
__**Practical questions:**__

**Some other notes related to this paper:**
  - Generative story: the generative process that is assumed to have produced the observed data given some hidden variables.
  - [[http://
  - Gibbs sampling: a Markov chain Monte Carlo method that resamples each hidden variable in turn from its conditional distribution given the current values of all the others.
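
The Gibbs sampling note above can be illustrated on a made-up two-variable joint distribution (unrelated to the paper's decipherment model): resample each variable from its conditional given the other, and the long-run samples approximate the joint.

```python
import random

# Toy Gibbs sampler over two binary variables (a, b) with an invented joint.
random.seed(0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def cond_a_given_b(b):
    """P(a=1 | b), derived from the joint table."""
    p1 = joint[(1, b)]
    return p1 / (joint[(0, b)] + p1)

def cond_b_given_a(a):
    """P(b=1 | a), derived from the joint table."""
    p1 = joint[(a, 1)]
    return p1 / (joint[(a, 0)] + p1)

a, b = 0, 0
samples = []
for sweep in range(25000):
    # One Gibbs sweep: resample a given b, then b given the new a.
    a = 1 if random.random() < cond_a_given_b(b) else 0
    b = 1 if random.random() < cond_b_given_a(a) else 0
    if sweep >= 5000:  # discard burn-in sweeps
        samples.append(a)

estimate = sum(samples) / len(samples)
print(estimate)  # should be close to the true marginal P(a=1) = 0.5
```

In Bayesian decipherment the same idea applies with the hidden English sequence as the sampled state, but the conditionals come from the LM and channel model rather than a small table.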
+ | |||
+ | Why did they experiment with Temporal expression corpus? This corpus has relatively small word types, it makes easier to compare Iterative EM with full EM. | ||
+ | |||
+ | ==== Section 3 ==== | ||
+ | Not many details of this section was presented, however, there are few discussions around this. | ||
+ | |||
+ | How to choose the best translation? | ||
+ | |||
+ | Given another text (which is not in training data), how to translate it? Use MLE to find the best translation from the model. | ||
+ | |||
+ | ==== Conclusion ==== | ||
+ | This is an interesting paper, however, there is a lot of maths behind. | ||
+ | |||
+ | |||