Differences

This shows you the differences between two versions of the page.

--- courses:rg:2013:memm [2014/10/12 15:03]
popel
+++ courses:rg:2013:memm [2014/10/12 15:04] (current)
popel
@@ Line 1: / Line 1: @@
 ===== Maximum Entropy Markov Models - Questions =====
-**1. Explain (roughly) how the new formula for α_t+1(s) is derived (i.e. formula 1 in the paper).**
+. Explain (roughly) how the new formula for α_t+1(s) is derived (i.e. formula 1 in the paper).
-**2. Section 2.1 states "we will split P(s|s',o) into |S| separately trained transition functions". What are the advantages and disadvantages of this approach?**
+. Section 2.1 states "we will split P(s|s',o) into |S| separately trained transition functions". What are the advantages and disadvantages of this approach?
-**3. Let S= {V,N} (verb and non-verb)
+. Let S= {V,N} (verb and non-verb)
 Training data = he/N can/V can/V a/N can/N
 //Observation features// are:
@@ Line 12: / Line 12: @@
 b3 = current word is “a” and next word is “can”
 When implementing MEMM you need to define s_0, i.e. the previous state before the first token. It may be a special NULL, but for simplicity let’s define it as N.
-a) What are the states (s) and observations (o) for this training data?**
+a) What are the states (s) and observations (o) for this training data?
-**b) Equation (2) defines features f_a based on //observation features// b. How many such f_a features do we have?**
+b) Equation (2) defines features f_a based on //observation features// b. How many such f_a features do we have?
-**c) Equation (3) defines constraints. How many such constraints do we have?**
+c) Equation (3) defines constraints. How many such constraints do we have?
-**d) List all the constraints involving feature b2, i.e. substitute (whenever possible) concrete numbers into Equation (3).**
+d) List all the constraints involving feature b2, i.e. substitute (whenever possible) concrete numbers into Equation (3).
-**e) In step 3 of the GIS algorithm you need to compute <latex>P_{s’}^{(j)}(s|o)</latex>. Compute <latex>P_N^{(0)}(N|can)</latex> and <latex>P_N^{(0)}(V|can)</latex>.**
+e) In step 3 of the GIS algorithm you need to compute <latex>P_{s’}^{(j)}(s|o)</latex>. Compute <latex>P_N^{(0)}(N|can)</latex> and <latex>P_N^{(0)}(V|can)</latex>.
 **Hint** : You might be confused about the m_s' variable (and  t_1, …, <latex>t_{m_{s'}}</latex>) in Equation (3).

[ Back to the navigation ] [ Back to the content ]

Institute of Formal and Applied Linguistics Wiki

Differences