Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:rg:2011:deciphering_foreign_language [2012/01/07 13:15]
tran
courses:rg:2011:deciphering_foreign_language [2012/01/07 13:41]
tran
Line 11: Line 11:
 \mathop {\arg \max }\limits_\theta  \prod\limits_{e,f} {p_\theta  (f|e)} 
 </latex>
 +
 +When no parallel data is available, we observe only the foreign text and try to maximize its likelihood:
 +<latex>
 +\mathop {\arg \max }\limits_\theta  \prod\limits_f {p_\theta  (f)} 
 +</latex>
 +
 +Treating the English translation <latex>e</latex> as a hidden variable, in the same way as the alignment <latex>a</latex>, our task is to find the parameter <latex>\theta</latex> given by
 +<latex>
 +\mathop {\arg \max }\limits_\theta  \prod\limits_f {\sum\limits_e {P(e) \times \sum\limits_a {P_\theta  (f,a|e)} } } 
 +</latex>
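
The objective above can be read as two marginalizations, one over the hidden English sentence and one over the hidden alignment. As a short sketch, assuming <latex>P(e)</latex> is a fixed English language model that does not depend on <latex>\theta</latex>:
<latex>
P_\theta  (f) = \sum\limits_e {P(e) \times P_\theta  (f|e)} , \qquad P_\theta  (f|e) = \sum\limits_a {P_\theta  (f,a|e)} 
</latex>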
 +
 +==== Section 2 ====
 +Section 2 deals with a simple version of translation, Word Substitution Decipherment, in which there is a one-to-one mapping between the source string and the cipher string (token positions do not change).
 +
 +The solution to this problem is fairly simple: given a sequence of English tokens <latex>e=e_1,e_2,...,e_n</latex> and the corresponding sequence of cipher tokens <latex>c=c_1,c_2,...,c_n</latex>, we need to estimate the parameter <latex>\theta</latex>:
 +<latex>
 +\mathop {\arg \max }\limits_\theta  \prod\limits_c {P_\theta  (c)}  = \mathop {\arg \max }\limits_\theta  \prod\limits_c {\sum\limits_e {P(e) \times \prod\limits_{i = 1}^n {P_\theta  (c_i |e_i )} } } 
 +</latex>
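
To make the objective concrete, the following minimal sketch evaluates <latex>P_\theta (c)</latex> for one cipher sequence by enumerating every candidate English sequence. It is only an illustration under stated assumptions: the two-word vocabulary, the unigram model standing in for <latex>P(e)</latex>, and the names `cipher_likelihood`, `unigram_lm` and `channel` are all hypothetical and do not come from the page or the paper. Enumeration is exponential in the sequence length, so this is usable only on toy inputs.

```python
import itertools
import math


def cipher_likelihood(cipher, english_vocab, lm_prob, channel):
    """Return P_theta(c) = sum_e P(e) * prod_i P_theta(c_i | e_i).

    cipher        -- list of cipher tokens c_1..c_n
    english_vocab -- list of candidate English tokens (toy assumption)
    lm_prob       -- function mapping an English token tuple to P(e)
    channel       -- dict {(c_i, e_i): P_theta(c_i | e_i)}; missing pairs count as 0
    """
    total = 0.0
    # Enumerate every English sequence of the same length as the cipher text.
    for english in itertools.product(english_vocab, repeat=len(cipher)):
        prob = lm_prob(english)
        # Positions are preserved, so token i of e generates token i of c.
        for c_i, e_i in zip(cipher, english):
            prob *= channel.get((c_i, e_i), 0.0)
        total += prob
    return total


def unigram_lm(unigram):
    """Build a toy P(e) from unigram probabilities (an assumption for illustration)."""
    def lm_prob(english):
        return math.prod(unigram[word] for word in english)
    return lm_prob


english_vocab = ["a", "b"]
lm = unigram_lm({"a": 0.75, "b": 0.25})

# A deterministic substitution table: "a" is always written "X", "b" always "Y".
deterministic = {("X", "a"): 1.0, ("Y", "b"): 1.0}
# An uninformed table: every English token emits either cipher token equally often.
uniform = {(c, e): 0.5 for c in ["X", "Y"] for e in english_vocab}

cipher = ["X", "X", "Y"]
p_deterministic = cipher_likelihood(cipher, english_vocab, lm, deterministic)
p_uniform = cipher_likelihood(cipher, english_vocab, lm, uniform)
```

With these toy numbers the deterministic table gives the cipher text a higher likelihood than the uniform one, which is the direction in which maximizing over <latex>\theta</latex> pushes the parameters.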
 +
  
