Differences
This shows you the differences between two versions of the page.
external:addicter [2011/05/16 07:08] mphi |
external:addicter [2011/05/22 09:31] (current) mphi |
||
---|---|---|---|
Line 4: | Line 4: | ||
This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | ||
+ | |||
+ | ==== TODOs ==== | ||
+ | * test alignment with synonym detection (cz_wn required) = separating '' | ||
+ | * try misplaced phrase detection | ||
+ | * parse the reference, project the parse onto the hypothesis via word alignment, and extract phrases from there | ||
+ | * group adjacent words aligned to the same word? (imitation) | ||
+ | * order evaluation | ||
+ | * currently finds misplaced items, but their shift distances are off | ||
+ | * not important for '' | ||
+ | * to fix -- for every misplaced token | ||
+ | * if it (and only it) were to be moved in the original permutation, | ||
+ | * evaluate with nr. of intersections | ||
+ | * try domain adaptation for word alignment with the "via source" | ||
+ | * technical | ||
+ | * clean up and comment the code | ||
+ | * add help files | ||
+ | * integrate with the rest of Addicter | ||
+ | * approach applicable to learner' | ||
+ | * see Anke Lüdeling, TLT9 | ||
+ | * try Blast (Sara' | ||
+ | * alternative to reference-based evaluation: " | ||
==== Word Alignment -- Progress and Results ==== | ==== Word Alignment -- Progress and Results ==== | ||
+ | |||
+ | === Latest best results === | ||
+ | [[http:// | ||
=== Alternative model comparison === | === Alternative model comparison === | ||
- | | ^ Precision/ | ||
- | | ^ Lex ^ Order ^ Punct ^ Miss ^ | ||
- | ^ meteor | ||
- | ^ ter* ^0.106/ | ||
- | ^ hmm | ||
- | ^ lcs | ||
- | ^ gizadiag* ^0.183/ | ||
- | ^ gizainter ^0.170/ | ||
- | ^ berkeley* ^0.200/ | ||
- | === Explicit wrong lex choice detection === | + | hmm = lightweight direct alignment method (in our ACL/TSD article) |
- | | + | gizainter |
- | * extract hypothesis-to-reference alignments from there | + | gizadiag |
- | | | Precision/ | + | czenginter = align source+CzEng to reference+CzEng, |
- | | | + | czengdiag = same, but with GIZA++ grow-diag |
- | | czengdiag* |0.187/ | + | |
- | | czenginter |0.197/ | + | | ^ Precision/ |
+ | | ^ Lex | ||
+ | ^ ter* | ||
+ | ^ meteor | ||
+ | ^ hmm | ||
+ | ^ lcs |0.168/ | ||
+ | ^ gizainter |0.170/ | ||
+ | ^ gizadiag* |0.183/ | ||
+ | ^ czengdiag* |0.187/ | ||
+ | ^ berkeley* | ||
+ | ^ czenginter |0.197/ | ||
+ | |||
+ | * non-1-to-1 alignments, converted to 1-to-1 via " | ||
=== Alignment combinations === | === Alignment combinations === | ||
via weighted HMM | via weighted HMM | ||
- | | | Precision/ | + | | ^ Precision/ |
- | | | Lex | + | | ^ Lex ^ Order ^ Punct ^ Miss ^ |
- | | meteor+hmm |0.162/0.426/**0.234** |0.068/0.309/**0.112** |0.286/0.794/**0.421** |0.025/0.400/**0.047** | | + | ^ ter+hmm |
- | | ter+hmm |0.116/0.402/**0.180** |0.030/0.184/**0.051** |0.145/0.912/**0.251** |0.026/0.181/**0.046** | | + | ^ meteor+hmm |0.162/0.426/**0.234** |0.068/0.309/**0.112** |0.286/0.794/**0.421** |0.025/0.400/**0.047** | |
- | | gizadiag+hmm |0.186/ | + | ^ gizadiag+hmm |
- | | gizainter+hmm |0.194/ | + | ^ gizainter+hmm |
- | | berkeley+hmm |0.203/ | + | ^ berkeley+hmm |
- | | czengdiag+hmm |0.190/ | + | ^ czengdiag+hmm |
- | | czenginter+hmm |0.214/ | + | ^ czenginter+hmm |
| ||||| | | ||||| | ||
- | | berkeley+czenginter+hmm |0.219/ | + | ^ berk+czengint+hmm |0.219/ |
- | | berkeley+czenginter+gizainter+hmm |0.220/ | + | ^ berk+czengint+gizaint+hmm |0.220/ |
- | | berkeley+czenginter+meteor+hmm |0.220/ | + | ^ berk+czengint+meteor+hmm |0.220/ |
+ | ^ berk+czengint+meteor+gizaint+hmm |0.221/ | ||
- | ==== TODOs ==== | ||
- | * try domain adaptation for word alignment, EMNLP 2011 paper | ||
- | * test alignment with synonym detection (cz_wn required) = separating @@lex@@ and @@disam@@ | ||
- | * order evaluation | ||
- | * a lot of background research | ||
- | * currently finds misplaced items, but their shift distances are off | ||
- | * to fix -- for every misplaced token | ||
- | * if it (and only it) were to be moved in the original permutation, | ||
- | * evaluate with nr. of intersections | ||
- | * comb and comment the code | ||
- | * add help files | ||
- | * integrate with the rest of Addicter | ||
- | * learner' | ||
- | * see Anne Lüdelig, TLT9 | ||
- | * adapt to Sara's program | ||
- | * alternative to reference-based evaluation: " | ||
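
Note on reading the result triples: the table header is cut off after "Precision/", so treating each cell as precision/recall/F-score is an assumption. A minimal sketch to check it against the one row that survives in full (meteor+hmm) is given here; the function and variable names are made up for illustration and are not part of Addicter.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# meteor+hmm row as printed in the "Alignment combinations" table.
# Assumed layout of each cell: (precision, recall, bold third value).
METEOR_HMM = {
    "Lex": (0.162, 0.426, 0.234),
    "Order": (0.068, 0.309, 0.112),
    "Punct": (0.286, 0.794, 0.421),
    "Miss": (0.025, 0.400, 0.047),
}


def max_deviation(row: dict) -> float:
    """Largest gap between the printed third value and the recomputed F-score."""
    return max(abs(f_score(p, r) - f) for p, r, f in row.values())


if __name__ == "__main__":
    for category, (p, r, f) in METEOR_HMM.items():
        print(f"{category}: printed {f:.3f}, recomputed {f_score(p, r):.3f}")
```

The recomputed values agree with the printed bold values only up to about 0.001, which would be expected if the bold values were computed from unrounded precision and recall. This supports the precision/recall/F reading but does not prove it.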