Institute of Formal and Applied Linguistics Wiki


Questions

Aron Culotta, Jeffrey Sorensen: Dependency Tree Kernels for Relation Extraction, ACL 2004.

  1. Given Figure 1, what is the smallest common subtree that includes both t1 (Troops) and t2 (near)?
  2. Section 5: “Therefore, d(a)=l(a).” When is this true and why? (Assume this holds for the following questions.)
  3. Let <latex>\phi_m</latex> = {general-pos-tag, entity-type, relation-arguments} (in accordance with the paper). Let <latex>\phi_s = \phi_m</latex> (unlike in the paper). Based on Figure 2 and Section 5, compute the following matching functions and similarity functions:
    • m(t0,u0)=? m(t1,u1)=? m(t2,u2)=?
    • s(t0,u0)=? s(t1,u1)=? s(t2,u2)=?
  4. Let <latex>\lambda=0.5</latex>. Compute the contiguous kernel for the two trees in Figure 2: <latex>K_1(T,U)=?</latex>. Provide the final number and some counts along the way, so it's clear how you got the number. Optionally, also compute the sparse kernel <latex>K_0(T,U)=?</latex>.
  5. Let DT be a function that assigns the correct augmented dependency tree to a sentence. Compute (estimate) the contiguous kernel and the bag-of-words kernel for the following sentences:
    • <latex>K_1</latex>(DT(“Peter sleeps”), DT(“Bob runs”))=?
    • <latex>K_2</latex>(DT(“Peter sleeps”), DT(“Bob runs”))=?
  6. Let's take the following pair of sentences:
    • “Bob saw US troops that moved towards Baghdad”
    • “US troops that moved towards Baghdad were seen by Bob”

You want to check the relation between entities “US” and “Baghdad”. Compute (estimate) <latex>K_1</latex> and <latex>K_2</latex>.

Answers

  1. This depends on the exact definition of the smallest common subtree, but keep in mind that you need at least some non-trivial “context”. The definition should be such that the contiguous and sparse kernels are effectively different things. The whole subtree is probably the right answer here.
  2. d(a) is defined as the last index of the sequence minus the first index plus one, i.e. <latex>d(a)=a_{l(a)}-a_1+1</latex>. If the sequence is contiguous (no missing indices), it can be shown (e.g. by induction) that the equation holds, unless some of the indices are repeated. Strictly speaking, (1,1,1) is a valid sequence.
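    • A worked check: for <latex>a=(2,3,4)</latex>, <latex>l(a)=3</latex> and <latex>d(a)=4-2+1=3</latex>, so the equation holds; for the non-contiguous <latex>a=(2,4)</latex>, <latex>l(a)=2</latex> but <latex>d(a)=3</latex>; and for the repeated <latex>a=(1,1,1)</latex>, <latex>l(a)=3</latex> but <latex>d(a)=1</latex>.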
  3. This depends on how you treat "N/A" values; by the definition, you should sum the values that are "the same/compatible" (disregarding the "type" of the feature). (See the first sketch after this list.)
    • m(t0,u0)=1 m(t1,u1)=1 m(t2,u2)=0
    • s(t0,u0)=5 s(t1,u1)=3 s(t2,u2)=0 (or not defined)
  4. First, this depends on the previous answer (the "N/A" values), and second, the paper doesn't say how to compute <latex>K_0(t_i[A],t_j[B])</latex>, where A and B are sequences with more than one member. One proposed solution was to use <latex>K_0(t_i[A],t_j[B])=\sum_{s=1}^{l(A)}K_0(t_i[a_s],t_j[b_s])</latex>. (See the second sketch after this list.)
    • <latex>K_0(T,U)=s(t_0,u_0)+\lambda^2\left[K_0(t_1,u_1)+K_0(t_1,u_2)+K_0(t_2,u_1)+K_0(t_2,u_2)+K_0(t_3,u_1)+K_0(t_3,u_2)\right]+\lambda^4\left[K_0(\{t_1,t_2\},\{u_1,u_2\})+K_0(\{t_2,t_3\},\{u_1,u_2\})\right]+\lambda^5K_0(\{t_1,t_3\},\{u_1,u_2\})</latex>
    • Only the pairs <latex>(t_1,u_1)</latex>, <latex>(t_3,u_2)</latex> and <latex>(t_4,u_3)</latex> match (<latex>t_4</latex> and <latex>u_3</latex> being the children of <latex>t_3</latex> and <latex>u_2</latex>), so this reduces to <latex>K_0(T,U)=s(t_0,u_0)+(\lambda^2+\lambda^4+\lambda^5)s(t_1,u_1)+(\lambda^2+\lambda^4+\lambda^5)s(t_3,u_2)+(\lambda^4+\lambda^6+\lambda^7)s(t_4,u_3)</latex>
    • When computing <latex>K_1</latex>, you leave out the <latex>(\{t_1,t_3\},\{u_1,u_2\})</latex> term, since that child subsequence is not contiguous.
  5. When you regard the bag-of-words kernel as the number of matching word forms, then <latex>K_2</latex> is zero, whereas <latex>K_1</latex> is (very likely) positive, since nodes can still match on features such as the general POS tag. (See the last sketch after this list.)
  6. It was argued that we will probably end up with different relation-argument features for the two sentences, so there will be no match.
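
For answer 3, here is a minimal Python sketch of one possible reading of the matching function m and the similarity function s. The set-valued features, the concrete feature values, and the treatment of "N/A" as an empty value set are assumptions made for this page, not the paper's implementation; the multi-valued features also show how a value like s(t0,u0)=5 can exceed the three features of <latex>\phi_s</latex>.

<code python>
PHI_M = ("general-pos-tag", "entity-type", "relation-arguments")
PHI_S = PHI_M  # phi_s = phi_m, as in question 3 (unlike in the paper)

def m(t, u, phi=PHI_M):
    """Matching function: 1 iff every feature in phi shares a value.
    "N/A" is modelled as an empty set here, so it never matches."""
    return int(all(t.get(f, set()) & u.get(f, set()) for f in phi))

def s(t, u, phi=PHI_S):
    """Similarity function: number of compatible value pairs, summed over
    phi; a multi-valued feature can contribute more than one point."""
    return sum(1 for f in phi
                 for vq in t.get(f, set())
                 for vr in u.get(f, set())
                 if vq == vr)

# Hypothetical nodes with invented feature values:
ta = {"general-pos-tag": {"NN"}, "entity-type": {"GPE", "LOC"},
      "relation-arguments": {"ARG_A", "ARG_B"}}
ua = {"general-pos-tag": {"NN"}, "entity-type": {"GPE", "LOC"},
      "relation-arguments": {"ARG_A", "ARG_B"}}
print(m(ta, ua), s(ta, ua))  # -> 1 5 (s exceeds the 3 features of phi_s)
</code>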
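For answer 4, a sketch of the full kernel recursion under the extension proposed above (K over equal-length child subsequences as the sum of K over the aligned node pairs). The single-valued features, the "N/A"-matches-"N/A" treatment, and the toy trees are assumptions for illustration; contiguous=True gives the contiguous kernel <latex>K_1</latex>, contiguous=False the sparse kernel <latex>K_0</latex>.

<code python>
from itertools import combinations

PHI = ("general-pos-tag", "entity-type", "relation-arguments")
LAM = 0.5

def node(pos, ent=None, args=None, children=()):
    """A tree node; None stands for an "N/A" feature value."""
    return {"feats": {"general-pos-tag": pos, "entity-type": ent,
                      "relation-arguments": args},
            "children": list(children)}

def m(t, u):
    # single-valued matching; here "N/A" is taken as compatible with
    # "N/A" -- one of the possible treatments from answer 3
    return all(t["feats"][f] == u["feats"][f] for f in PHI)

def s(t, u):
    # only real (non-"N/A") matching values are counted
    return sum(t["feats"][f] is not None and t["feats"][f] == u["feats"][f]
               for f in PHI)

def subseqs(n, contiguous):
    # ascending index subsequences of range(n); the contiguous kernel
    # keeps only those without gaps, i.e. d(a) == l(a)
    for length in range(1, n + 1):
        for a in combinations(range(n), length):
            if not contiguous or a[-1] - a[0] + 1 == length:
                yield a

def d(a):
    return a[-1] - a[0] + 1  # span of an index sequence

def kernel(t, u, contiguous, lam=LAM):
    if not m(t, u):
        return 0.0
    total = float(s(t, u))
    tc, uc = t["children"], u["children"]
    for a in subseqs(len(tc), contiguous):
        for b in subseqs(len(uc), contiguous):
            if len(a) == len(b):
                # the extension above: K over equal-length child
                # subsequences is the sum of K over aligned node pairs
                total += lam ** (d(a) + d(b)) * sum(
                    kernel(tc[i], uc[j], contiguous, lam)
                    for i, j in zip(a, b))
    return total

# Toy trees loosely shaped like Figure 2 (feature values invented):
T = node("V", children=[node("N", "PER", "ARG_A"),   # ~ t1
                        node("ADV"),                 # ~ t2, no match in U
                        node("P", children=[node("N", "GPE", "ARG_B")])])  # ~ t3/t4
U = node("V", children=[node("N", "PER", "ARG_A"),   # ~ u1
                        node("P", children=[node("N", "GPE", "ARG_B")])])  # ~ u2/u3

print(kernel(T, U, contiguous=True))   # K_1 = 2.484375
print(kernel(T, U, contiguous=False))  # K_0 = 2.6328125 (adds the gapped {t1,t3} pair)
</code>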
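And for answer 5, the bag-of-words kernel read as the number of matching word forms (whether that is the right feature set was left open; see Misc below):

<code python>
from collections import Counter

def bow_kernel(sent1, sent2):
    """Bag-of-words kernel read as the number of matching word forms,
    counted with multiplicity (a multiset intersection)."""
    c1, c2 = Counter(sent1.lower().split()), Counter(sent2.lower().split())
    return sum(min(c1[w], c2[w]) for w in c1)

print(bow_kernel("Peter sleeps", "Bob runs"))  # -> 0, as in answer 5
</code>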

Misc

  1. There was some discussion about what the features for the bag-of-words kernel are (just the presence of a word in the sentence?)
  2. Feature selection, mainly the relation-args feature
  3. “Two-level” classification, and why it might be a good idea
