1. Of the three main types of word representations described in the paper, to which type does each of the following two samples belong?
a)
dog -0.087099201783 -0.136966257697 0.106813367913 [47 more numbers]
cat -0.103287428163 -0.0066971301398 -0.0346911076188 [47 more numbers]
b)
dog 11010111010
cat 11010111010
2. Section 4.1 defines a corrupted (or noise) n-gram, but the definition contains a tiny error/typo. Nitpick and point it out.
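For context, Section 4.1's C&W setup scores an n-gram and a corrupted copy of it, and trains with the ranking loss max(0, 1 - s(n) + s(ñ)). Below is a minimal Python sketch of that setup, with `score` left as a placeholder; which position is replaced, and whether the replacement word must differ from the original, are exactly the details the question asks you to check against the paper:

```python
import random

def corrupt_ngram(ngram: list[str], vocab: list[str],
                  position: int = -1) -> list[str]:
    # Replace one word of the n-gram with a word sampled uniformly
    # from the vocabulary. The chosen position and the (missing?)
    # constraint on the sampled word are what question 2 probes.
    noisy = list(ngram)
    noisy[position] = random.choice(vocab)
    return noisy

def ranking_loss(score_true: float, score_noise: float) -> float:
    # C&W ranking loss from Section 4.1: max(0, 1 - s(n) + s(ñ)).
    return max(0.0, 1.0 - score_true + score_noise)
```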
3. Section 7.4 states that “word representations in NER brought larger gains on the out-of-domain data than on the in-domain data.” Try to guess the reason.
4.
a) Does Table 1 include any compound feature?
b) Does it contain any compound feature with word representations?
c) Give an example of a possible compound feature with word representations for the NER task (a generic compound-feature sketch, without word representations, follows below).
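To clarify what a compound (conjoined) feature looks like before answering 4c, here is a minimal sketch, assuming a simple string-concatenation encoding of feature conjunctions. The feature templates are hypothetical and deliberately avoid word representations, so 4c is left to you:

```python
def atomic_features(tokens: list[str], i: int) -> dict[str, str]:
    # Hypothetical atomic feature templates for token i.
    return {
        "word": tokens[i],
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "is_capitalized": str(tokens[i][0].isupper()),
    }

def compound_feature(feats: dict[str, str], a: str, b: str) -> str:
    # A compound feature conjoins two atomic features into a single
    # indicator, e.g. "prev_word=New & is_capitalized=True".
    return f"{a}={feats[a]} & {b}={feats[b]}"

tokens = ["New", "York", "is", "large"]
feats = atomic_features(tokens, 1)
print(compound_feature(feats, "prev_word", "is_capitalized"))
# -> prev_word=New & is_capitalized=True
```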
5.
Consider the C&W embedding vectors with 50 dimensions. Guess which word has the embedding vector most similar (by Euclidean distance) to each of the following vectors (a sketch of the nearest-neighbor computation appears after the hint):
a) vector(king) - vector(man) + vector(woman)
b) vector(dollars) - vector(dollar) + vector(mouse)
Hint: The paper is 11 pages long. You can skip Sections 2 and 3.2, which contain the literature review.
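To make the computation in question 5 concrete, here is a minimal NumPy sketch that forms vector(king) - vector(man) + vector(woman) and retrieves the nearest word by Euclidean distance. The embeddings file name is hypothetical, and the whitespace-separated format is assumed to match sample (a) of question 1:

```python
import numpy as np

def load_embeddings(path: str) -> dict[str, np.ndarray]:
    # Assumed format: one word per line followed by 50 floats.
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.split()
            table[word] = np.array(values, dtype=float)
    return table

def nearest(query: np.ndarray, table: dict[str, np.ndarray],
            exclude: set[str]) -> str:
    # Word with the smallest Euclidean distance to the query vector,
    # skipping the words used to build the query.
    candidates = ((w, np.linalg.norm(v - query))
                  for w, v in table.items() if w not in exclude)
    return min(candidates, key=lambda pair: pair[1])[0]

emb = load_embeddings("cw_embeddings_50d.txt")  # hypothetical file name
query = emb["king"] - emb["man"] + emb["woman"]
print(nearest(query, emb, exclude={"king", "man", "woman"}))
```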