1. What is (capital-letter) X in Equations 3 and 5?
2. What do we need in order to detect whether a given set of French sentences downloaded from the web was produced by an MT system and watermarked?
a) know the source language
b) have the source sentences (at least for part of the set)
c) have the source code of the French web pages
d) have access to the version of the MT system that supposedly produced the French sentences
e) know the hash function h used in watermarking
f) know the D_k function used in watermarking
g) know which sub-results were used (1-grams, 1-5-grams, …)
h) know the value of parameter k used in watermarking
i) know which method of loss interpolation was used (max K-best / rank interpolation / cost interpolation)
j) know the ranking function w
k) know the significance level alpha
3. Why is the binomial distribution used? Could we choose another discrete distribution?
4. Guess what would happen if an SMT system were trained only on the outputs of its own previous version.
5. What hash function h would you suggest for the task of MT watermarking?
6. Can Google use this watermarking technique to detect (and filter out from training data) Microsoft's Bing translations?
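
As background for questions 2 and 3, the detection step can be pictured with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: SHA-256 is only a hypothetical stand-in for the hash function h, whole sentences are hashed instead of the sub-results, and the names hash_bits, binomial_tail and looks_watermarked are illustrative. The null hypothesis is that the text is not watermarked, so every hash bit is 1 with probability 0.5 and the number of ones follows a binomial distribution.

```python
import hashlib
from math import comb


def hash_bits(text):
    """Hypothetical stand-in for h: map a string to 256 bits using SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [(byte >> i) & 1 for byte in digest for i in range(8)]


def binomial_tail(ones, n):
    """Exact P(X >= ones) for X ~ Binomial(n, 0.5), using integer arithmetic."""
    return sum(comb(n, j) for j in range(ones, n + 1)) / 2 ** n


def looks_watermarked(sentences, alpha=0.05):
    """Return (p_value, decision) for a one-sided test on the count of 1 bits.

    alpha is the significance level; 0.05 is an arbitrary default chosen
    for this sketch, not a value from the paper.
    """
    bits = [b for s in sentences for b in hash_bits(s)]
    n, ones = len(bits), sum(bits)
    p_value = binomial_tail(ones, n)
    return p_value, p_value < alpha


if __name__ == "__main__":
    sample = ["Le chat dort sur le canapé.", "Il pleut depuis ce matin."]
    p_value, flagged = looks_watermarked(sample)
    print(f"p-value = {p_value:.4f}, flagged = {flagged}")
```

Note which inputs the sketch requires (the hash function and the significance level) and which it never touches (the source sentences and the MT system itself); that contrast is a useful starting point for question 2.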