courses:rg:2011:deciphering_foreign_language [2012/01/07 13:14] tran

The talk is about how to tackle MT without parallel training data.
==== Section 1 ====
Given sentence pairs (e,f), where e is an English sentence and f is a foreign sentence, the translation model estimates the parameter <latex>\theta</latex> that maximizes the likelihood of the parallel data:
<latex>
\mathop {\arg \max }\limits_\theta \prod\limits_{(e,f)} P_\theta (f|e)
</latex>
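As a toy illustration of this supervised objective, the sketch below estimates <latex>P(f|e)</latex> by relative frequency; the two sentence pairs, the assumed one-to-one word alignment, and all numbers are invented for illustration only:

```python
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus: (English sentence, foreign sentence) pairs,
# assumed word-aligned one-to-one so counting stays trivial.
pairs = [
    ("the house", "la casa"),
    ("the cat", "el gato"),
]

# Count aligned (e, f) word pairs, then normalize per English word: relative
# frequency is the closed-form maximizer of prod P_theta(f|e) under this toy alignment.
counts = defaultdict(Counter)
for e_sent, f_sent in pairs:
    for e_w, f_w in zip(e_sent.split(), f_sent.split()):
        counts[e_w][f_w] += 1

p_f_given_e = {
    e_w: {f_w: c / sum(cs.values()) for f_w, c in cs.items()}
    for e_w, cs in counts.items()
}

print(p_f_given_e["the"])  # {'la': 0.5, 'el': 0.5}
```

With real data the alignment itself is hidden and must be estimated (e.g. by EM, as in the IBM models); the point here is only that parallel pairs make <latex>\theta</latex> directly estimable.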
+ | |||
+ | In case we do not have parallel data, we observe foreign text and try to maximize likelihood | ||
+ | < | ||
+ | \mathop {\arg \max }\limits_\theta | ||
+ | </ | ||
+ | |||
+ | Treating English translation as hidden alignment, our task is to find the parameter < | ||
+ | < | ||
+ | \mathop {\arg \max }\limits_\theta | ||
+ | </ | ||
+ | |||
+ | ==== Section 2 ==== | ||
+ | Section 2 deals with a simple version of translation, | ||
+ | |||
+ | The solution for this problem is pretty simple: Given a sequence of English tokens < | ||
+ | < | ||
+ | \mathop {\arg \max }\limits_\theta | ||
+ | </ | ||
+ | |||
+ | The key idea of section 2 is the Iterative EM algorithm, which is used to estimate < | ||
+ | |||
+ | If we use traditional EM, every time we update < | ||
+ | |||
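A rough sketch of that staged schedule, assuming a toy unigram substitution model with invented counts (the paper's actual model and data differ): EM runs over only the most frequent types, the vocabulary grows stage by stage, and tiny entries are pruned in between so later stages stay cheap.

```python
from collections import Counter, defaultdict

foreign_text = ["X"] * 5 + ["Y"] * 3 + ["Z"] * 2        # toy ciphertext tokens
english_freq = Counter({"cat": 5, "dog": 3, "owl": 2})  # assumed English counts

def run_em(f_tokens, e_vocab, theta, iters=10):
    """A few EM iterations of the toy unigram model, restricted to a sub-vocabulary."""
    total = sum(english_freq[e] for e in e_vocab)
    p_e = {e: english_freq[e] / total for e in e_vocab}  # unigram LM, held fixed
    f_vocab = sorted(set(f_tokens))
    for e in e_vocab:  # (re)initialize channel rows for newly added words
        row = theta.setdefault(e, {})
        for f in f_vocab:
            row.setdefault(f, 1.0 / len(f_vocab))
    for _ in range(iters):
        counts = defaultdict(lambda: defaultdict(float))
        for f in f_tokens:
            # E-step: posterior over the hidden English word behind this token.
            scores = {e: p_e[e] * theta[e].get(f, 0.0) for e in e_vocab}
            z = sum(scores.values())
            for e, s in scores.items():
                counts[e][f] += s / z
        # M-step: re-estimate each row of P(f|e) from expected counts.
        for e in e_vocab:
            row_total = sum(counts[e].values())
            theta[e] = {f: c / row_total for f, c in counts[e].items()}
    return theta

theta = {}
f_by_freq = [f for f, _ in Counter(foreign_text).most_common()]
e_by_freq = [e for e, _ in english_freq.most_common()]
for k in (1, 2, 3):  # grow both vocabularies, most frequent types first
    stage_tokens = [f for f in foreign_text if f in f_by_freq[:k]]
    theta = run_em(stage_tokens, e_by_freq[:k], theta)
    # Pruning step: drop translation pairs that EM left with tiny probability.
    theta = {e: {f: p for f, p in row.items() if p > 0.01}
             for e, row in theta.items()}
```

The payoff is that each stage's E-step touches only the active sub-vocabulary instead of the full cross-product of word types.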
__Practical question:__ How to initialize EM?
+ | |||
+ | |||