Transductive learning for statistical machine translation

Nicola Ueffing and Gholamreza Haffari and Anoop Sarkar

Introduction

The paper investigates transductive semi-supervised methods that exploit monolingual source-language data to improve translation quality. "Transductive" here means that the system repeatedly translates sentences from the development set or test set and uses the generated translations to improve the SMT system. Transductive learning is thus another means of adapting an SMT system to a new type of text.

The authors mention two SMT modeling problems that call for different learning strategies to improve translation quality:

1. SMT systems face a data sparseness problem even when a large bitext is available for a language pair.
2. For many language pairs, the amount of available bilingual text is very limited.

The authors' hypothesis is that adding information from source-language data might help improve translation quality.

Comments

The paper gives a very clear description of the transductive learning algorithm, Algorithm 1, which is inspired by the Yarowsky algorithm [1].

* In Algorithm 1, the translation model is estimated from the sentence pairs in the bilingual data L. Then a set of source-language sentences, U, is translated with the current model. In each iteration, a subset of good translations and their sources, Ti, is selected and added to the training data. These sentence pairs are replaced in each iteration; only the original data, L, stays fixed throughout the algorithm.

* Algorithm 1 is built on three functions: Estimate, Score and Select.

* The Estimate function estimates the model parameters; in other words, it trains the system. The authors used three different methods for parameter estimation: Full Re-training, Additional Phrase Table and Mixture Model.

* The Score function assigns a score to each translation t. The scoring functions used in the paper are: Length-normalized Score and Confidence Estimation.

* The Select function creates the additional training data Ti, which Estimate uses in the next iteration i+1 to augment the original bilingual data. The selection functions used in the paper are: Importance Sampling, Selection using a Threshold and Keep All.

* Data filtering is performed on both bilingual and monolingual data to keep only that part of the data which is relevant to the test data.
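
The loop described in the bullets above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the "model" is a toy word lookup table rather than an SMT system, the length-normalized score is assumed to be the model log-probability divided by the translation length, and weighting the importance sampling by exp(score) is our own choice for the sketch.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]        # (source sentence, translation)
Hyp = Tuple[str, str, float]  # (source, translation, model log-probability)


def length_normalized_score(X: Sequence[Hyp]) -> List[float]:
    """Assumed form: log-probability divided by translation length in tokens."""
    return [logp / max(len(t.split()), 1) for (_, t, logp) in X]


def select_threshold(X: Sequence[Hyp], S: Sequence[float], threshold: float) -> List[Pair]:
    """Keep the hypotheses whose score reaches the threshold."""
    return [(s, t) for (s, t, _), sc in zip(X, S) if sc >= threshold]


def select_keep_all(X: Sequence[Hyp], S: Sequence[float]) -> List[Pair]:
    """Keep every hypothesis, ignoring the scores."""
    return [(s, t) for (s, t, _) in X]


def select_importance_sampling(X: Sequence[Hyp], S: Sequence[float],
                               k: int, rng: random.Random) -> List[Pair]:
    """Sample k hypotheses with replacement, weighted by exp(score) (an assumption)."""
    weights = [math.exp(sc) for sc in S]
    return [(s, t) for (s, t, _) in rng.choices(list(X), weights=weights, k=k)]


def transductive_loop(L, U, estimate, decode, score, select, rounds):
    """Estimate on L plus the previous selection, translate U, score, select, repeat."""
    T: List[Pair] = []  # selected pairs; replaced in every round, L stays fixed
    model = None
    for _ in range(rounds):
        model = estimate(L, T)
        X = [(s, t, logp) for s in U for (t, logp) in decode(model, s)]
        S = score(X)
        T = select(X, S)
    return model, T


# Toy stand-ins for training and decoding (hypothetical, for illustration only).
def toy_estimate(L: Sequence[Pair], T: Sequence[Pair]) -> Dict[str, str]:
    table: Dict[str, str] = {}
    for src, tgt in list(T) + list(L):  # L comes last, so the original pairs win
        for sw, tw in zip(src.split(), tgt.split()):
            table[sw] = tw
    return table


def toy_decode(model: Dict[str, str], s: str) -> List[Tuple[str, float]]:
    words = s.split()
    unknown = sum(w not in model for w in words)
    translation = " ".join(model.get(w, w) for w in words)
    # 1-best list with a made-up log-probability that penalizes unknown words
    return [(translation, -1.0 * unknown - 0.1 * len(words))]


L = [("das haus", "the house")]
U = ["das haus", "das auto"]
model, T = transductive_loop(
    L, U, toy_estimate, toy_decode, length_normalized_score,
    lambda X, S: select_threshold(X, S, -0.2), rounds=2,
)
```

Passing Estimate, Score and Select in as plain functions mirrors the way the paper swaps different variants into the same loop; replacing the lambda with `select_keep_all` or a wrapper around `select_importance_sampling` changes only the selection step.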

What do we like about the paper

What do we dislike about the paper


Institute of Formal and Applied Linguistics Wiki
