Q1. Recall the papers presented by Tam three weeks ago.
a) What is the difference between a distributional and a distributed semantic representation?
b) What is the maximum dimension of a word vector in the distributional representation approach?
Q2.
a) Compute the similarity between the two words “Moon” and “Mars” from the co-occurrence matrix below:
|      | planet | night | full | shadow | shine |
|------|--------|-------|------|--------|-------|
| Moon | 34     | 27    | 19   | 9      | 20    |
| Sun  | 32     | 23    | 10   | 47     | 15    |
| Dog  | 0      | 19    | 2    | 11     | 1     |
| Mars | 44     | 23    | 17   | 3      | 9     |
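The question does not fix a similarity measure; a common choice for co-occurrence rows is cosine similarity (an assumption here, not stated in the question). A minimal sketch using the rows from the table above:

```python
import math

# Co-occurrence rows from the table above (planet, night, full, shadow, shine).
moon = [34, 27, 19, 9, 20]
mars = [44, 23, 17, 3, 9]

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(round(cosine(moon, mars), 3))  # → 0.95
```

Other measures (e.g. Jaccard or Euclidean distance) would also be acceptable answers if justified.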
b) How do they manage the dimensionality of the vectors in those papers? (Consider the 1st paper.)
Do you think this is a disadvantage?
Can you suggest some techniques for them to manage vector dimensionality?
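One candidate answer for the last part is truncated SVD (as used in LSA), which compresses high-dimensional co-occurrence vectors into a small number of latent dimensions. A minimal sketch with NumPy on a toy matrix (the values are illustrative, not taken from the papers):

```python
import numpy as np

# Toy word-by-context co-occurrence matrix (5 words x 5 contexts) — illustrative only.
X = np.array([
    [34, 27, 19,  9, 20],
    [32, 23, 10, 47, 15],
    [ 0, 19,  2, 11,  1],
    [44, 23, 17,  3,  9],
    [ 5,  8, 30,  2,  6],
], dtype=float)

# Truncated SVD: keep only the k strongest latent dimensions.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
word_vectors = U[:, :k] * s[:k]  # each word is now a k-dimensional vector

print(word_vectors.shape)  # (5, 2)
```

Random projections or feature hashing would be other reasonable suggestions.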
Q3.
a) What are Bag of Words (BOW) and Bag of Visual Words (BOVW)? Are they synonyms?
b) How do they apply BOVW to compute the representation of a word (concept) from a large set of images?
(Note: the two papers use different visual features.)
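As background for part (b), the usual BOVW pipeline is: extract local descriptors from each image, cluster them into a "visual vocabulary" with k-means, represent each image as a histogram of visual-word counts, and pool the histograms over all images tagged with a concept. A minimal sketch with scikit-learn on random stand-in descriptors (the descriptor type, vocabulary size, and pooling method are assumptions for illustration, not the papers' exact settings):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for local descriptors (e.g. 128-d SIFT) pooled from many training images.
descriptors = rng.normal(size=(1000, 128))

# 1) Build the visual vocabulary by clustering the descriptors.
k = 50  # vocabulary size — an illustrative assumption
vocab = KMeans(n_clusters=k, n_init=10, random_state=0).fit(descriptors)

def bovw_histogram(image_descriptors):
    """Map one image's descriptors to a normalized visual-word histogram."""
    words = vocab.predict(image_descriptors)
    hist = np.bincount(words, minlength=k).astype(float)
    return hist / hist.sum()

# 2) Concept vector: average the histograms over all images of the concept.
images = [rng.normal(size=(200, 128)) for _ in range(10)]  # stand-in images
concept_vector = np.mean([bovw_histogram(d) for d in images], axis=0)
print(concept_vector.shape)  # (50,)
```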
Q4. When they construct text-based vectors of words from a corpus in the 2nd paper (Section 2.1),
they mention an LMI score. What is that score? Why did they use it?
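As a hint: LMI (Local Mutual Information) is usually defined as the raw co-occurrence count times the pointwise mutual information, LMI(w, c) = f(w, c) · log( f(w, c) · N / (f(w) · f(c)) ), which damps PMI's bias toward rare events. A minimal sketch of that formula, assuming this standard definition (toy counts, not the paper's data):

```python
import math

def lmi(f_wc, f_w, f_c, n):
    """Local Mutual Information: co-occurrence count times PMI (base-2 log)."""
    return f_wc * math.log2((f_wc * n) / (f_w * f_c))

# Toy counts — illustrative only: word seen 100 times, context 200 times,
# 30 co-occurrences, corpus of 10,000 pairs.
print(round(lmi(f_wc=30, f_w=100, f_c=200, n=10000), 2))  # → 117.21
```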
Q5. Have you ever wished to see a beautiful mermaid?
Have you ever seen a unicorn in real life?
Think of a computational way to show what they might look like.