Differences

This shows you the differences between two versions of the page.

--- courses:rg:2011:deciphering_foreign_language [2012/01/07 13:15]
tran
+++ courses:rg:2011:deciphering_foreign_language [2012/01/07 13:41]
tran
@@ Line 11: / Line 11: @@
 \mathop {\arg \max }\limits_\theta  \prod\limits_{e,f} {p_\theta  (f|e)}
 </latex>
+When no parallel data is available, we observe only the foreign text and maximize its likelihood:
+<latex>
+\mathop {\arg \max }\limits_\theta  \prod\limits_f {p_\theta  (f)}
+</latex>
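+Treating the English translation <latex>e</latex> as a hidden variable and marginalizing over it and over the hidden alignments <latex>a</latex>, our task is to find the parameter <latex>\theta</latex> given by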
+<latex>
+\mathop {\arg \max }\limits_\theta  \prod\limits_f {\sum\limits_e {P(e) \times \sum\limits_a {P_\theta  (f,a|e)} } }
+</latex>
+==== Section 2 ====
+Section 2 deals with a simple version of translation, Word Substitution Decipherment, in which each source token maps one-to-one to a cipher token and token positions do not change.
+The formulation of this problem is simple: given a sequence of English tokens <latex>e=e_1,e_2,...,e_n</latex> and the corresponding sequence of cipher tokens <latex>c=c_1,c_2,...,c_n</latex>, we need to estimate the parameter <latex>\theta</latex>:
+<latex>
+\mathop {\arg \max }\limits_\theta  \prod\limits_c {P_\theta  (c)}  = \mathop {\arg \max }\limits_\theta  \prod\limits_c {\sum\limits_e {P(e) \times \prod\limits_{i = 1}^n {P_\theta  (c_i |e_i )} } }
+</latex>
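The word-substitution objective can be illustrated with a small brute-force sketch. It evaluates <latex>P_\theta(c)</latex> by enumerating every English sequence of the same length, exactly as the sum over <latex>e</latex> is written. Everything in it is an assumption for illustration, not taken from the paper: the toy vocabulary, the unigram source model standing in for <latex>P(e)</latex>, the cipher symbols, and the function names. Enumeration is exponential in the sequence length, so it only works on toy input.

```python
from itertools import product
from math import prod


def lm_prob(e, unigram):
    """Toy source model P(e): independent unigrams (illustrative assumption)."""
    return prod(unigram[w] for w in e)


def cipher_likelihood(c, theta, unigram):
    """P_theta(c) = sum_e P(e) * prod_i P_theta(c_i | e_i), by enumerating all e."""
    vocab = list(unigram)
    total = 0.0
    for e in product(vocab, repeat=len(c)):
        # Channel term: product over positions of P_theta(c_i | e_i).
        channel = prod(theta[ei].get(ci, 0.0) for ei, ci in zip(e, c))
        total += lm_prob(e, unigram) * channel
    return total


def corpus_likelihood(corpus, theta, unigram):
    """Product over cipher sequences c of P_theta(c): the quantity to maximize."""
    return prod(cipher_likelihood(c, theta, unigram) for c in corpus)


# Hypothetical toy data: three English words, three cipher symbols.
unigram = {"the": 0.6, "cat": 0.3, "sat": 0.1}
corpus = [("X", "Y"), ("X", "Z"), ("X", "Y")]

# Two candidate keys, written as deterministic channels theta[e][c] = P(c | e).
theta_good = {"the": {"X": 1.0}, "cat": {"Y": 1.0}, "sat": {"Z": 1.0}}
theta_bad = {"the": {"Z": 1.0}, "cat": {"Y": 1.0}, "sat": {"X": 1.0}}

if __name__ == "__main__":
    print("good key:", corpus_likelihood(corpus, theta_good, unigram))
    print("bad key: ", corpus_likelihood(corpus, theta_bad, unigram))
```

On this toy input, the key that maps the frequent cipher symbol to the frequent English word gets the higher corpus likelihood, which is the signal the arg max exploits. A practical estimator would replace the enumeration with an iterative procedure such as EM over a richer language model.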

Institute of Formal and Applied Linguistics Wiki