Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

khresmoi:en-fr [2012/01/19 11:47]
hlavacova
khresmoi:en-fr [2012/02/29 14:20]
hlavacova
Line 1: Line 1:
 ===== Parallel data EN-FR ===== ===== Parallel data EN-FR =====
-So far I have everything stored on my own machine.  --- //[[hlavacova@ufal.mff.cuni.cz|hlavacova]] 2012/01/19 11:08// + 
-==== EMEA ==== +==== LDC ==== 
-Source: http://opus.lingfil.uu.se/EMEA.php +:?
-**en-fr.tmx.gz** ... aligned data - download translation memory files (TMX), 373 152 sentence pairs +  * **Hansard French/English** ... LDC Catalog No.: LDC95T20, government documents 
-**en-fr.xml.gz** ... sentence alignments in XCES format +This would have to be ordered, but it is expensive: 
-**en-fr.txt.zip** ... English texts only, about medicines - they look like package leaflets; 1 092 568 sentences, 26.34M words, download plain text files (MOSES/GIZA++) +Member fee: $0 for 1995, 1996, 1997 members 
-The **fr** directory contains French texts, presumably parallel to en-fr.txt.zip (I will verify), in some XML format, morphologically tagged. 1987 files, 14.9M tokens, 1.2M sentences +Reduced-License Fee: US $3250.00 
-                +  * **UN Parallel Text (Complete)** ... LDC Catalog No.: LDC94T4A, languages EN, FR, SP, government documents 
 +This would have to be ordered, but it is expensive: 
 +Member fee: $0 for 1994 members 
 +Non-member Fee: US $4000.00 
 +Reduced-License Fee: US $2000.00 
 + 
 + 
 + 
  
