
Institute of Formal and Applied Linguistics Wiki



Deciphering Foreign Language

Scriber: Ke. T

The talk is about how to train a machine translation (MT) system without parallel training data.

Section 1

Given sentence pairs (e,f), where e is an English sentence and f is its foreign translation, the translation model estimates the parameters
<latex>\theta</latex> that maximize the likelihood of the parallel data:
<latex>
\mathop {\arg \max }\limits_\theta \prod\limits_{(e,f)} {p_\theta (f|e)}
</latex>

When no parallel data is available, we observe only the foreign text and instead maximize its likelihood:
<latex>
\mathop {\arg \max }\limits_\theta \prod\limits_f {p_\theta (f)}
</latex>

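Treating the English translation e as a hidden variable, weighted by an English language model P(e), and summing over the hidden alignments a, our task is to find the parameters <latex>\theta</latex> that satisfy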
<latex>
\mathop {\arg \max }\limits_\theta \prod\limits_f {\sum\limits_e {P(e) \times \sum\limits_a {P_\theta (f,a|e)} } }
</latex>
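
The objective above can be made concrete on a toy example. The following is a minimal sketch, not the method presented in the talk: it assumes a tiny hand-written English language model `lm`, a hypothetical Model-1-style channel in which every alignment is equally likely, and a word translation table `t` keyed by (foreign word, English word). All names and numbers are invented for illustration. It enumerates every e and every a explicitly, which is only feasible at toy scale.

```python
import itertools
import math


def p_f_a_given_e(f, a, e, t):
    """P_theta(f, a | e) for an assumed Model-1-style channel.

    a[j] is the index of the English word that generates f[j].
    Every alignment has the same prior, 1 / len(e) ** len(f).
    """
    prob = 1.0 / (len(e) ** len(f))
    for j, i in enumerate(a):
        prob *= t.get((f[j], e[i]), 0.0)  # t(f_j | e_i), 0 if unseen
    return prob


def p_f(f, lm, t):
    """p_theta(f) = sum_e P(e) * sum_a P_theta(f, a | e)."""
    total = 0.0
    for e, p_e in lm.items():
        alignments = itertools.product(range(len(e)), repeat=len(f))
        total += p_e * sum(p_f_a_given_e(f, a, e, t) for a in alignments)
    return total


def log_likelihood(corpus, lm, t):
    """Log of the product over all foreign sentences f of p_theta(f)."""
    return sum(math.log(p_f(f, lm, t)) for f in corpus)


# Toy language model P(e): "the house" is more frequent than "the book".
lm = {("the", "house"): 0.6, ("the", "book"): 0.4}

# Monolingual foreign corpus: no English side is observed.
corpus = [("das", "haus"), ("das", "haus"), ("das", "buch")]

# Two candidate parameter settings theta (translation tables t(f | e)).
t_good = {("das", "the"): 1.0, ("haus", "house"): 1.0, ("buch", "book"): 1.0}
t_swapped = {("das", "the"): 1.0, ("haus", "book"): 1.0, ("buch", "house"): 1.0}

ll_good = log_likelihood(corpus, lm, t_good)
ll_swapped = log_likelihood(corpus, lm, t_swapped)
print(ll_good > ll_swapped)
```

In this toy setting the table that maps the frequent foreign word to the frequent English word scores higher, which illustrates how P(e) alone can steer the choice of <latex>\theta</latex> when no parallel data is observed. In practice the sums over e and a are far too large to enumerate, so this brute-force version only serves to spell out the formula.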

