Differences
This shows you the differences between two versions of the page.
courses:rg:2012:spe-for-smt [2012/10/12 12:57] jindra.helcl |
courses:rg:2012:spe-for-smt [2012/10/12 14:17] (current) popel my notes |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Statistical Post-Editing for a Statistical MT System ====== | ====== Statistical Post-Editing for a Statistical MT System ====== | ||
- | |||
- | //*** !! Under construction !! ***// | ||
- | |||
//Hanna Béchara, Yanjun Ma, Josef van Genabith// | //Hanna Béchara, Yanjun Ma, Josef van Genabith// | ||
MT Summit 2011 | MT Summit 2011 | ||
+ | [[http:// | ||
Presented by Rudolf Rosa | Presented by Rudolf Rosa | ||
Report by Jindřich Helcl | Report by Jindřich Helcl | ||
+ | |||
===== Introduction ===== | ===== Introduction ===== | ||
This article is about statistical post-editing of the output of a statistical machine translation system. The most interesting part of the article is the authors' claim that they achieved an improvement of about 2 BLEU points by pipelining two statistical MT systems, which had until then been considered useless. | This article is about statistical post-editing of the output of a statistical machine translation system. The most interesting part of the article is the authors' claim that they achieved an improvement of about 2 BLEU points by pipelining two statistical MT systems, which had until then been considered useless. | ||
Line 15: | Line 14: | ||
A brief outline of the paper follows. In the introduction, | A brief outline of the paper follows. In the introduction, | ||
* **Data:** The data for the experiment came from an English-French translation memory from Symantec. The size of the data was about 55k sentences (0.8M words) in each language. In the paper, they call the English training data **E** and the French data **F**. | * **Data:** The data for the experiment came from an English-French translation memory from Symantec. The size of the data was about 55k sentences (0.8M words) in each language. In the paper, they call the English training data **E** and the French data **F**. | ||
- | * **Architecture: | + | * **Architecture: |
* **Enhancements: | * **Enhancements: | ||
- | * Contextual SPE, which means that each translated word was formed by concatenating the English word and its translation, separated by a hash sign, into a single token. This new dataset is called **E#F'** in the paper. With this enhancement, | + | * Contextual SPE, which means that each translated word was formed by concatenating the English word and its translation, separated by a hash sign, into a single token. This new dataset is called **F'#E** in the paper. With this enhancement, |
- | * Next, they stripped off the #-postfixes of non-translated words. | + | * Next, they stripped off the #-postfixes of non-translated words (OOV). | ||
* Then, they aligned the source text with the translation and used the contextual enhancement only where the alignment weight was above some threshold. | * Then, they aligned the source text with the translation and used the contextual enhancement only where the alignment weight was above some threshold. | ||
& | & | ||
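The contextual enhancements listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `contextualize` and `strip_postfix`, the alignment representation (a dictionary from a position in the first-stage output to a source position and a weight), and the example threshold are all assumptions made for illustration. The token shape follows the **F'#E** naming used in the notes.

```python
from typing import Dict, List, Tuple


def contextualize(
    mt_tokens: List[str],
    src_tokens: List[str],
    alignment: Dict[int, Tuple[int, float]],
    threshold: float = 0.0,
) -> List[str]:
    """Build F'#E-style tokens for one sentence (hypothetical helper).

    `alignment` maps an index into `mt_tokens` to a pair
    (index into `src_tokens`, alignment weight). A token is annotated
    only if it has a link whose weight is above `threshold`.
    """
    out = []
    for i, tok in enumerate(mt_tokens):
        link = alignment.get(i)
        if link is not None and link[1] > threshold:
            out.append(f"{tok}#{src_tokens[link[0]]}")
        else:
            # No link, or a link that is too weak: keep the plain token.
            out.append(tok)
    return out


def strip_postfix(token: str) -> str:
    """Remove the #-postfix from a token, leaving plain tokens unchanged."""
    return token.split("#", 1)[0]
```

For example, with the assumed threshold 0.5, `contextualize(["le", "fichier"], ["the", "file"], {0: (0, 0.9), 1: (1, 0.4)}, 0.5)` annotates only the first token, and `strip_postfix` can then be applied to any annotated token that the second stage left untranslated.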
Line 27: | Line 26: | ||
* The main possible flaw of the experiment was assumed to be the size of the data (only 55k sentences). On the other hand, the translation memory data were said to be clean and free of duplicates. However, the authors do not explain why they used such a small dataset when other options are easily available. One possible explanation is that their translation system was built for the domain of the Symantec data, but this is not explicitly stated in the article. | * The main possible flaw of the experiment was assumed to be the size of the data (only 55k sentences). On the other hand, the translation memory data were said to be clean and free of duplicates. However, the authors do not explain why they used such a small dataset when other options are easily available. One possible explanation is that their translation system was built for the domain of the Symantec data, but this is not explicitly stated in the article. | ||
* In the paper, they state that they use a 10-fold cross-validation approach to build a new dataset. Many of us were confused by this statement and found it unclear what exactly the authors meant. We finally agreed that the new dataset is created fold by fold: the SMT system is trained on the other 9 folds of **E** and **F** and then run on the tenth fold of the source language. | * In the paper, they state that they use a 10-fold cross-validation approach to build a new dataset. Many of us were confused by this statement and found it unclear what exactly the authors meant. We finally agreed that the new dataset is created fold by fold: the SMT system is trained on the other 9 folds of **E** and **F** and then run on the tenth fold of the source language. | ||
- | * # | + | * We found it pointless for the authors to present explicit results of Contextual SPE without removing the #-postfixes, as it was simple enough to remove them right away. This objection led us to the idea of removing the #-postfixes even before the OOV token is passed to the language model, since this could bring some improvement. |
- | * alignment | + | * When the authors wrote about Contextual SPE with thresholding, |
- | * structure of the article | + | - Moses can also output the word alignment together with the translations. Although the alignment originates from GIZA++ (it is //the// alignment which was used to build the phrase table), Ondřej Bojar says it is unusual to describe this approach as //"we do this using GIZA++ word-alignments"// 
+ | - (New) GIZA++ can be trained (and applied) on the 55k sentence pairs (**E**, **F' | ||
+ | Method #1 should be more accurate, but it seems the authors used method #2. | ||
+ | * How could OOVs arise in post-editing? | ||
+ | - Either because they used method #2 which mis-aligned // | ||
+ | - Or the first stage system trained on particular 9/10 of the training data cannot translate // | ||
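The fold-by-fold construction of the new dataset, as we understood it in the discussion above, can be sketched in Python as follows. This is only our reading of the paper, not a confirmed description of the authors' pipeline. The name `build_f_prime` and the `train` callback (which stands in for training a first-stage SMT system and returning a translation function) are hypothetical, and assigning sentences to folds round-robin is an arbitrary choice.

```python
from typing import Callable, List, Sequence

Translator = Callable[[str], str]
Trainer = Callable[[Sequence[str], Sequence[str]], Translator]


def build_f_prime(
    E: Sequence[str], F: Sequence[str], train: Trainer, k: int = 10
) -> List[str]:
    """Translate every sentence of E with a system that never saw it.

    For each of the k folds, a system is trained on the other k-1 folds
    of (E, F) and used to translate the held-out fold of E.
    """
    if len(E) != len(F):
        raise ValueError("E and F must be parallel (same number of sentences)")
    n = len(E)
    f_prime = [""] * n
    for fold in range(k):
        # Round-robin fold assignment (an assumption of this sketch).
        held_out = [i for i in range(n) if i % k == fold]
        held = set(held_out)
        train_e = [E[i] for i in range(n) if i not in held]
        train_f = [F[i] for i in range(n) if i not in held]
        translate = train(train_e, train_f)
        for i in held_out:
            f_prime[i] = translate(E[i])
    return f_prime
```

The result is aligned sentence by sentence with **E** and **F**, so the pairs (**F'**, **F**) can serve as training data for the second-stage (post-editing) system. Because each first-stage system is trained on only 9/10 of the data, some words of a held-out sentence may be unknown to it.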
- | ===== Conclusion ===== | ||
- | - evaluation | + | ===== Conclusion ===== | ||
+ | Although the structure of the paper was often criticized and possible flaws were found, the article was considered readable and simple enough to be the opening article for this semester' 