Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:rg:2013:memm [2014/10/12 15:03] popel
courses:rg:2013:memm [2014/10/12 15:04] (current) popel
Line 1:
 ===== Maximum Entropy Markov Models - Questions =====
 
-**1. Explain (roughly) how the new formula for α_t+1(s) is derived (i.e. formula 1 in the paper).**
+1. Explain (roughly) how the new formula for α_t+1(s) is derived (i.e. formula 1 in the paper).
  
-**2. Section 2.1 states "we will split P(s|s',o) into |S| separately trained transition functions". What are the advantages and disadvantages of this approach?**
+2. Section 2.1 states "we will split P(s|s',o) into |S| separately trained transition functions". What are the advantages and disadvantages of this approach?
  
-**3. Let S= {V,N} (verb and non-verb)
+3. Let S= {V,N} (verb and non-verb)
 Training data = he/N can/V can/V a/N can/N
 //Observation features// are:
Line 12:
 b3 = current word is “a” and next word is “can”
 When implementing MEMM you need to define s_0, i.e. the previous state before the first token. It may be a special NULL, but for simplicity let’s define it as N.
-a) What are the states (s) and observations (o) for this training data?**
+a) What are the states (s) and observations (o) for this training data?
  
-**b) Equation (2) defines features f_a based on //observation features// b. How many such f_a features do we have?**
+b) Equation (2) defines features f_a based on //observation features// b. How many such f_a features do we have?
  
-**c) Equation (3) defines constraints. How many such constraints do we have?**
+c) Equation (3) defines constraints. How many such constraints do we have?
  
-**d) List all the constraints involving feature b2, i.e. substitute (whenever possible) concrete numbers into Equation (3).**
+d) List all the constraints involving feature b2, i.e. substitute (whenever possible) concrete numbers into Equation (3).
  
-**e) In step 3 of the GIS algorithm you need to compute <latex>P_{s’}^{(j)}(s|o)</latex>. Compute <latex>P_N^{(0)}(N|can)</latex> and <latex>P_N^{(0)}(V|can)</latex>.**
+e) In step 3 of the GIS algorithm you need to compute <latex>P_{s’}^{(j)}(s|o)</latex>. Compute <latex>P_N^{(0)}(N|can)</latex> and <latex>P_N^{(0)}(V|can)</latex>.
  
 **Hint** : You might be confused about the m_s' variable (and  t_1, …, <latex>t_{m_{s'}}</latex>) in Equation (3).
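
One way to make m_s' concrete is to split the training data from question 3 by previous state. The Python sketch below is an illustration only, not part of the paper or of the original question sheet. It assumes that m_s' counts the time steps whose previous state is s', and that t_1, …, t_{m_s'} are exactly those time steps. The function name split_by_previous_state is hypothetical.

```python
from collections import defaultdict

# Training data from question 3 as (word, state) pairs.
training = [("he", "N"), ("can", "V"), ("can", "V"), ("a", "N"), ("can", "N")]

# Previous state before the first token; the question sets s_0 = N.
S0 = "N"


def split_by_previous_state(tagged, s0):
    """Group (time step, observation, state) triples by previous state s'.

    Assumption: each group is the training subset of one transition
    function P_{s'}(s|o), and its size plays the role of m_{s'}.
    """
    groups = defaultdict(list)
    prev = s0
    for t, (obs, state) in enumerate(tagged, start=1):
        groups[prev].append((t, obs, state))
        prev = state  # the current state is the previous state of the next step
    return dict(groups)


groups = split_by_previous_state(training, S0)
for prev_state in sorted(groups):
    events = groups[prev_state]
    print(f"s'={prev_state}  m={len(events)}  steps={[t for t, _, _ in events]}")
```

Under these assumptions the split gives three time steps (1, 2, 5) for s' = N and two time steps (3, 4) for s' = V, so each transition function sees only its own subset of the five tokens.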
