
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

courses:rg:2012:rosareport [2012/09/17 01:26]
rosa adding discussion
courses:rg:2012:rosareport [2012/09/17 01:38] (current)
rosa adding more discussion
Line 14:
  
   * Why does the paper talk about "forced alignment"?
-    * It seems they actually perform forced decoding.
+    * It seems they actually perform "forced decoding".
   * The paper describes three ways of using the alignment (best alignment, n-best alignments, all alignments), but Formula (3) only applies to the first way.
   * The authors claim to have avoided using "heuristics" for phrase alignment. However, they do use the heuristics both in the first part and then in the interpolation.
Line 20:
     * The whole sentence is by definition always consistent with the word alignment.
     * There may be a hard limit for maximum phrase length, but this is not mentioned in the paper.
-  *
+  * We discussed whether even singleton phrases should be extracted or whether it would be better to skip them.
+    * Apparently, practice shows that there is little reason to skip them, as it is usually better to have more data.
+  * We found that the meaning of "cross-validation" is unclear from the paper.
+    * It seems to us that the authors simply used a misleading term here, as they use the procedure not for validation but for training (probably they just perform "leaving 10,000 out" instead of "leaving 1 out").
+  * We discussed whether in Table 4, N stands for the number of different alignments for a pair of sentences, but found out that we are rather unsure about that.
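
The remark above that the whole sentence is always consistent with the word alignment can be illustrated with a minimal sketch. It assumes the usual consistency criterion from heuristic phrase extraction (no alignment link may connect a word inside the phrase pair to a word outside it, and at least one link must lie inside); the paper under discussion may define it differently. The function name `is_consistent` and the toy alignment are hypothetical and serve only as illustration.

```python
def is_consistent(alignment, src_span, tgt_span):
    """Check whether a phrase pair is consistent with a word alignment.

    alignment: set of (source_index, target_index) links (hypothetical toy format).
    src_span, tgt_span: inclusive (start, end) index pairs.
    Assumed criterion: no link crosses the phrase-pair boundary,
    and at least one link lies inside the pair.
    """
    s_start, s_end = src_span
    t_start, t_end = tgt_span
    has_inner_link = False
    for i, j in alignment:
        src_inside = s_start <= i <= s_end
        tgt_inside = t_start <= j <= t_end
        if src_inside != tgt_inside:
            # The link leaves the phrase pair on exactly one side.
            return False
        if src_inside and tgt_inside:
            has_inner_link = True
    return has_inner_link


# Toy alignment for a 3-word source and a 3-word target sentence.
alignment = {(0, 0), (1, 2), (2, 1)}

whole_sentence = is_consistent(alignment, (0, 2), (0, 2))
crossing_pair = is_consistent(alignment, (1, 1), (1, 1))
swapped_block = is_consistent(alignment, (1, 2), (1, 2))
```

With the spans covering both sentences entirely, no link can leave the pair, so the check succeeds for any non-empty alignment. This is why a maximum phrase length, if one exists, would be the only thing preventing extraction of the full sentence pair.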
  
