
Institute of Formal and Applied Linguistics Wiki





SMT

GoogleMT

Franz Josef Och (och@google.com, research-translationapi@google.com)
Arabic (58 BLEU), Chinese, Russian

Arabic:

from news

tokens   BLEU
400M     53.5
47.2B    57.3
1.9T     58
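
As a rough reading of the table above, the BLEU gain per doubling of the token count can be computed directly from the three rows. This is only a sketch: the function name is made up for illustration, and it assumes that M, B and T stand for 10^6, 10^9 and 10^12.

```python
import math

# (tokens, BLEU) rows copied from the table above.
# Assumption: M / B / T mean 1e6 / 1e9 / 1e12.
ROWS = [(400e6, 53.5), (47.2e9, 57.3), (1.9e12, 58.0)]


def bleu_gain_per_doubling(rows):
    """Return the BLEU gain per doubling of tokens between consecutive rows."""
    gains = []
    for (tokens_a, bleu_a), (tokens_b, bleu_b) in zip(rows, rows[1:]):
        doublings = math.log2(tokens_b / tokens_a)
        gains.append((bleu_b - bleu_a) / doublings)
    return gains


GAINS = bleu_gain_per_doubling(ROWS)
for (tokens_a, _), (tokens_b, _), gain in zip(ROWS, ROWS[1:], GAINS):
    print(f"{tokens_a:.3g} -> {tokens_b:.3g} tokens: {gain:.2f} BLEU per doubling")
```

On these three data points the gain per doubling is noticeably smaller for the second step than for the first, so the numbers suggest diminishing returns, although three points are too few to say more.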

5-grams; 7-grams where space allows

LM: 5^11 1TB
hundreds of thousands of machines

HW: MapReduce - parallelism and distribution, plus fault tolerance
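
A minimal sketch of how n-gram counting for a language model fits the MapReduce pattern, simulated here in a single Python process. This is not Google's implementation: all function names are hypothetical, tokenization is plain whitespace splitting, and the default n=5 simply follows the 5-grams mentioned in these notes.

```python
from collections import defaultdict
from itertools import chain


def map_ngrams(sentence, n=5):
    """Map step: emit (ngram, 1) for every n-gram in one sentence."""
    tokens = sentence.split()
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n]), 1


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the emitted counts for each n-gram."""
    return {key: sum(values) for key, values in groups.items()}


def count_ngrams(sentences, n=5):
    """Run map, shuffle and reduce sequentially over all sentences."""
    pairs = chain.from_iterable(map_ngrams(s, n) for s in sentences)
    return reduce_counts(shuffle(pairs))
```

In a real cluster the map calls would run on separate machines over shards of the corpus, and the framework would handle the grouping step and rerun failed tasks; the sketch only shows the shape of the computation.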

Problems: named entities, mainly in Chinese

API on offer:
n-best list plus score
alignment information

