- After reading the first three chapters:
- list the main parts/components/structures of the model.
- Is their creation dependent on other components?
- Thinking about the scripts:
- What is the main reason (the biggest advantage) for using scripts? What kind of information do they bring? (Hint: page 2, page 8)
- The authors do not obtain the “knowledge” from the scripts directly. How is that “knowledge” represented in the model, and which (four) ways are used to obtain it?
- In the last paragraph of Section 3, a method is described that enhances the robustness of the model (binarization of all association weights <latex>w^z_i</latex>; see the small sketch after this list). Answer one of the following questions (choose one):
- Why does it work? (⇒ Why should it work best?)
- Do you have any idea how to do it differently?
- Experiments: Which “tricks”/parts of the processing improved the Attribute recognition and the Composite activity classification tasks the most? Try to explain why.
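A minimal sketch of what the binarization could look like (Python; the threshold, the array shape and the toy numbers are our own assumptions, since the paper only states that all association weights <latex>w^z_i</latex> are binarized):

<code python>
import numpy as np

def binarize_weights(w, threshold=0.0):
    """Binarize association weights w[z][i] (composite activity z, attribute i).

    Every weight above `threshold` becomes 1, the rest 0.  The threshold value
    is an assumption; the paper only says that the weights are binarized.
    """
    w = np.asarray(w, dtype=float)
    return (w > threshold).astype(int)

# toy example: noisy mined weights for 3 composite activities x 4 attributes
w = [[0.9, 0.0, 0.1, 0.0],
     [0.0, 0.4, 0.0, 0.2],
     [0.3, 0.0, 0.0, 0.7]]
print(binarize_weights(w, threshold=0.05))
</code>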
Answers
- First set
- list components (Google doc graph)
- dependence of components (the same graph)
- Scripts
- reason: scripts are a cheap source of training data; they cover many combinations and unseen variants, and give several “descriptions” of the same thing
- four ways: a 2×2 combination of word source and weighting: 1) direct use of the words from the data or 2) mapping the words to WordNet classes, combined with 3) simple word frequency or 4) TF*IDF (a small sketch of the two weightings follows after this list)
- There was a discussion about the 3rd set of questions. We are not sure why the authors do this; a strongly supported opinion was that the authors do a lot of unnecessary work, which is then lost by the binarization.
- 4th: The majority of answers nominated the use of TF*IDF in the case of no training data as the most helpful idea.
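As an illustration of the two weighting schemes in the 2×2 answer above, a minimal sketch in Python (the tokenised “scripts”, grouped per composite activity, and the function names are made up for this example; the WordNet-class variant would simply replace the tokens with their word classes before weighting):

<code python>
import math
from collections import Counter

def term_frequencies(docs):
    """Raw frequency of each term per 'document' (all scripts of one composite activity)."""
    return [Counter(doc) for doc in docs]

def tfidf(docs):
    """TF*IDF weighting: terms frequent in this activity but rare overall get a high weight."""
    tfs = term_frequencies(docs)
    n_docs = len(docs)
    df = Counter(term for tf in tfs for term in tf)  # in how many activities each term occurs
    return [{t: f * math.log(n_docs / df[t]) for t, f in tf.items()} for tf in tfs]

# toy example: tokenised scripts for three composite activities
docs = [["cut", "onion", "cut", "tomato"],
        ["boil", "water", "cut", "pasta"],
        ["whisk", "egg", "whisk", "sugar"]]
print(term_frequencies(docs))
print(tfidf(docs))
</code>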