
Institute of Formal and Applied Linguistics Wiki




You should focus on the first paper (skip section 2.3).
The second paper, an extension of the first one, is optional reading.

Q1.
a) Recall the paper about word representations presented by Tam on November 10.
Read http://www.quora.com/Whats-the-difference-between-distributed-and-distributional-semantic-representations

(M_{w,d} is a matrix with w rows and d columns).
What do w, d, and k mean?
What are the values of w, d, and k used in the experiments in this paper?

b) What is the maximum dimension of a word vector in the distributional representation approach?

Q2.
a) Compute the similarity between two words “Moon” and “Mars” from the co-occurrence matrix below.
Use these raw counts (no Local Mutual Information, no normalization) and cosine similarity.

         | planet | night | full | shadow | shine       
  Moon   |   34   |   27  |  19  |   9    |   20
  Sun    |   32   |   23  |  10  |   47   |   15
  Dog    |   0    |   19  |  2   |   11   |   1
  Mars   |   44   |   23  |  17  |   3    |   9
  
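One way to check the arithmetic for part a) is a short script; this is just a sketch using the raw counts from the table above:

```python
import math

# Raw co-occurrence counts from the table (planet, night, full, shadow, shine)
moon = [34, 27, 19, 9, 20]
mars = [44, 23, 17, 3, 9]

def cosine(u, v):
    # cos(u, v) = (u . v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(round(cosine(moon, mars), 4))  # high similarity: the two rows have similar shapes
```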

b) How do they deal with the high dimensionality of the vectors in those papers?
Can you suggest other techniques for managing vector dimensionality?
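As one candidate answer for part b) (a common technique, not necessarily the one used in the papers): truncated SVD keeps only the k largest singular values of the co-occurrence matrix. A minimal numpy sketch on the toy matrix from Q2:

```python
import numpy as np

# Toy co-occurrence matrix M (rows: Moon, Sun, Dog, Mars) from the Q2 table
M = np.array([[34, 27, 19,  9, 20],
              [32, 23, 10, 47, 15],
              [ 0, 19,  2, 11,  1],
              [44, 23, 17,  3,  9]], dtype=float)

# Truncated SVD: keep only the k largest singular values
k = 2
U, s, Vt = np.linalg.svd(M, full_matrices=False)
M_k = U[:, :k] * s[:k]   # k-dimensional word vectors
print(M_k.shape)         # each word is now a k-dimensional vector
```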

Q3.
a) What are Bag of Words (BOW) and Bag of Visual Words (BOVW)?
b) How do they apply BOVW to compute the representation of a word (concept) from a large set of images?
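As background for Q3, a hedged sketch of the generic BOVW pipeline (random stand-in descriptors here; real systems extract local descriptors such as SIFT from image patches): cluster the descriptors into a "visual vocabulary" with k-means, then represent an image as a histogram over the visual words.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in local descriptors (in practice: e.g. SIFT descriptors from patches)
train_desc = rng.normal(size=(500, 8))   # descriptors pooled from many images
image_desc = rng.normal(size=(60, 8))    # descriptors from one image

# 1) Build a "visual vocabulary" with a few iterations of plain k-means
k = 10
centers = train_desc[rng.choice(len(train_desc), k, replace=False)]
for _ in range(10):
    # assign each descriptor to its nearest center
    d = np.linalg.norm(train_desc[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # move each center to the mean of its assigned descriptors
    for j in range(k):
        if (labels == j).any():
            centers[j] = train_desc[labels == j].mean(axis=0)

# 2) An image = histogram of its descriptors over the k visual words
d = np.linalg.norm(image_desc[:, None, :] - centers[None, :, :], axis=2)
hist = np.bincount(d.argmin(axis=1), minlength=k)
print(hist.sum())  # one vote per descriptor
```

A word's visual representation can then be built by pooling such histograms over many images tagged with that word.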

Q4.
When they construct text-based vectors of words from the DM model,
they mention the Local Mutual Information score (section 3.2; also section 2.1 in the 2nd paper).
What is that score? Why did they use it?
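For reference, Local Mutual Information for a word–context pair is commonly defined as the observed count times the log ratio of observed to expected counts: LMI(w,c) = O · log(O / E), with E = (row total · column total) / grand total. A sketch on the toy table from Q2 (my own illustration, not the papers' data):

```python
import math

# Toy counts from the Q2 table; contexts: planet, night, full, shadow, shine
counts = {
    "Moon": [34, 27, 19, 9, 20],
    "Sun":  [32, 23, 10, 47, 15],
    "Dog":  [0, 19, 2, 11, 1],
    "Mars": [44, 23, 17, 3, 9],
}
contexts = ["planet", "night", "full", "shadow", "shine"]

grand = sum(sum(row) for row in counts.values())
col_tot = [sum(row[j] for row in counts.values()) for j in range(len(contexts))]

def lmi(word, j):
    o = counts[word][j]                          # observed count
    if o == 0:
        return 0.0
    e = sum(counts[word]) * col_tot[j] / grand   # expected count under independence
    return o * math.log(o / e)                   # LMI = O * log(O / E)

print(round(lmi("Moon", contexts.index("planet")), 2))
```

Multiplying by the observed count is what distinguishes LMI from plain PMI: it dampens the inflated scores PMI gives to rare events.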

Q5.
Have you ever wished to see beautiful “Mermaids”?
Have you ever seen “Unicorns” in real life?
“Assume that there are no photos of them on the Internet”

Think about a computational way to show how they would look.
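One possible direction (a hedged sketch with random stand-in vectors, not a worked answer): learn a linear map from text vectors to visual vectors on concepts that do have photos, then apply it to the text vector of an unseen concept like “unicorn” and compare the imagined visual vector against real images.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data: text vectors (from a corpus) and visual vectors (from photos)
T_seen = rng.normal(size=(100, 50))          # text vectors of 100 depictable concepts
W_true = rng.normal(size=(50, 20))
V_seen = T_seen @ W_true + 0.1 * rng.normal(size=(100, 20))  # their visual vectors

# Learn a linear text->vision map by least squares on the seen concepts
W, *_ = np.linalg.lstsq(T_seen, V_seen, rcond=None)

# "Imagine" the visual vector of an unseen concept from its text vector alone
t_unicorn = rng.normal(size=50)
v_imagined = t_unicorn @ W
print(v_imagined.shape)  # a point in visual space to compare against real images
```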

