courses:rg:extracting-parallel-sentences-from-comparable-corpora, revision [2011/05/22 19:23] (current) by ivanova; previous revision [2011/05/22 19:16] by ivanova
**Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment**
//Jason R. Smith, Chris Quirk and Kristina Toutanova//

====== Introduction ======

The article is about parallel sentence extraction from Wikipedia. This resource can be viewed as a comparable corpus in which the document alignment is already provided by the interwiki links.
  
  * An induced word-level lexicon in combination with sentence extraction helps to achieve substantial gains.
  
====== Strengths of the article ======
  
  * Novel approaches to extracting parallel sentences.
Our understanding of this feature is:
TOPIC A: EN <-> ES
         ↓      ↓
TOPIC B: EN <-> ES

where
↓ is a link

<-> is an interwiki link
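
The diagram can be read as a sentence-pair feature: the English and Spanish sentences each link to some article, and the two target articles are themselves joined by an interwiki link. Below is a minimal sketch of that reading. The function name, the data layout and the toy article titles are all hypothetical; it illustrates our interpretation, not the implementation used in the paper.

```python
# Hypothetical sketch: names and data layout are assumptions, not the paper's code.

def count_aligned_link_pairs(en_links, es_links, interwiki):
    """Count (EN target, ES target) pairs joined by an interwiki link.

    en_links:  titles linked from the English sentence (the EN "↓" arrow)
    es_links:  titles linked from the Spanish sentence (the ES "↓" arrow)
    interwiki: set of (EN title, ES title) pairs (the "<->" arrows)
    """
    return sum(
        1
        for en_target in en_links
        for es_target in es_links
        if (en_target, es_target) in interwiki
    )


# Toy data: TOPIC B exists in both languages and the versions are interwiki-linked.
interwiki = {("Topic B", "Tema B")}

# Both sentences (from the TOPIC A articles) link to TOPIC B: one aligned pair.
print(count_aligned_link_pairs({"Topic B"}, {"Tema B"}, interwiki))

# The Spanish sentence links to an unrelated article: no aligned pair.
print(count_aligned_link_pairs({"Topic B"}, {"Tema C"}, interwiki))
```

A count (rather than a yes/no flag) is only one possible design choice; a binary indicator would fit the diagram equally well.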
  
  
 --- //Comments by Angelina Ivanova//
  
  
  
  
