- After reading the first three chapters:
- list the main parts/components/structures of the model.
- Is their creation dependent on other components?
- Thinking about the scripts:
- What is the main reason (the biggest advantage) for using scripts? What kind of information do they bring? (Hint: page 2, page 8)
- The authors do not obtain the “knowledge” from the scripts directly. How is that “knowledge” represented in the model, and which (four) ways are used to obtain it?
- In the last paragraph of Section 3, a method is described that enhances the robustness of the model (binarization of all association weights <latex>w^z_i</latex>; see the small sketch after this list). Answer one of the following questions (choose one):
- Why does it work? (⇒ Why should it work best?)
- Do you have any idea how to do it differently?
- Experiments: Which “tricks”/parts of the processing improved the Attribute recognition and the Composite activity classification tasks the most? Try to explain why.
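A minimal sketch of what the binarization could look like (Python; the threshold, the array shape and the toy numbers are our own assumptions, since the paper only states that all association weights <latex>w^z_i</latex> are binarized):

<code python>
import numpy as np

def binarize_weights(w, threshold=0.0):
    """Binarize association weights w[z][i] (composite activity z, attribute i).

    Every weight above `threshold` becomes 1, the rest 0.  The threshold value
    is an assumption; the paper only says that the weights are binarized.
    """
    w = np.asarray(w, dtype=float)
    return (w > threshold).astype(int)

# toy example: noisy mined weights for 3 composite activities x 4 attributes
w = [[0.9, 0.0, 0.1, 0.0],
     [0.0, 0.4, 0.0, 0.2],
     [0.3, 0.0, 0.0, 0.7]]
print(binarize_weights(w, threshold=0.05))
</code>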
Answers
- First set
- list components (Google doc graph)
- dependence of components (the same graph)
- Scripts
- reason: scripts are a cheap source of training data; they cover many combinations and unseen variants, and give several “descriptions” of the same thing
- four ways: a 2×2 combination of word source and weighting: 1) direct use of the words from the data or 2) mapping the words to WordNet classes, combined with 3) simple word frequency or 4) TF*IDF (a small sketch of the two weightings follows after this list)
- There was a discussion about the 3rd set of questions. We are not sure why the authors do this; a strongly supported opinion was that the authors do a lot of unnecessary work, which is then lost by the binarization.
- 4th: The majority of answers nominated the use of TF*IDF in the case of no training data as the most helpful idea.
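As an illustration of the two weighting schemes in the 2×2 answer above, a minimal sketch in Python (the tokenised “scripts”, grouped per composite activity, and the function names are made up for this example; the WordNet-class variant would simply replace the tokens with their word classes before weighting):

<code python>
import math
from collections import Counter

def term_frequencies(docs):
    """Raw frequency of each term per 'document' (all scripts of one composite activity)."""
    return [Counter(doc) for doc in docs]

def tfidf(docs):
    """TF*IDF weighting: terms frequent in this activity but rare overall get a high weight."""
    tfs = term_frequencies(docs)
    n_docs = len(docs)
    df = Counter(term for tf in tfs for term in tf)  # in how many activities each term occurs
    return [{t: f * math.log(n_docs / df[t]) for t, f in tf.items()} for tf in tfs]

# toy example: tokenised scripts for three composite activities
docs = [["cut", "onion", "cut", "tomato"],
        ["boil", "water", "cut", "pasta"],
        ["whisk", "egg", "whisk", "sugar"]]
print(term_frequencies(docs))
print(tfidf(docs))
</code>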