This is an old revision of the document!

English-Hindi Translation – Obtaining Mediocre Results with Bad Data and Fancy Models

UNDER CONSTRUCTION!

This page is an add-on to the following paper:

Ondřej Bojar, Pavel Straňák, Daniel Zeman, Gaurav Jain, Michal Hrušecký, Michal Richter, Jan Hajič: English-Hindi Translation – Obtaining Mediocre Results with Bad Data and Fancy Models. In: Proceedings of ICON 2009, Hyderabad, December.

The purpose of the add-on page is to provide detailed documentation of the data, tools and settings used so that the results can be reproduced by other researchers.

Data
- IIIT-TIDES
- Daniel Pipes
- EMILLE
- Agrocorpus
- Shabdanjali
- Wikipedia Named Entities
Tools and their settings
- Tokenization and normalization of the data
- Hunalign
- GIZA++
- makecls
- SRILM
- Moses
- Joshua
- Mumbai Tagger
- Affisix
- Hindomor
- HiTBSuf
- počítadlo BLEU skóre
Link tables from the paper to concrete settings
Link to the PDF version of the paper; link to Biblio?

[ Back to the navigation ] [ Back to the content ]

Institute of Formal and Applied Linguistics Wiki

English-Hindi Translation – Obtaining Mediocre Results with Bad Data and Fancy Models

UNDER CONSTRUCTION!