Differences
This shows you the differences between two versions of the page.
external:addicter [2011/05/16 07:06] mphi |
external:addicter [2011/05/22 09:31] (current) mphi |
||
---|---|---|---|
Line 4: | Line 4: | ||
This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | ||
+ | |||
+ | ==== TODOs ==== | ||
+ | * test alignment with synonym detection (cz_wn required) = separating '' | ||
+ | * try misplaced phrase detection | ||
+ | * parse the reference, project the parse onto the hypothesis via word alignment, get phrases from there | ||
+ | * group adjacent words aligned to the same word? (imitation) | ||
+ | * order evaluation | ||
+ | * currently finds misplaced items, but their shift distances are off | ||
+ | * not important for '' | ||
+ | * to fix -- for every misplaced token | ||
+ | * if it (and only it) were to be moved in the original permutation, | ||
+ | * evaluate with the number of intersections | ||
+ | * try domain adaptation for word alignment with the "via source" | ||
+ | * technical | ||
+ | * clean up and comment the code | ||
+ | * add help files | ||
+ | * integrate with the rest of Addicter | ||
+ | * approach applicable to learner' | ||
+ | * see Anne Lüdelig, TLT9 | ||
+ | * try Blast (Sara' | ||
+ | * alternative to reference-based evaluation: " | ||
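The "intersections" item in the order-evaluation TODO can be sketched as counting crossing alignment links. This is only an illustration under an assumption: an intersection is taken to mean a pair of hypothesis tokens whose aligned reference positions appear in reversed order. The function names and the list-of-positions input are hypothetical and not taken from Addicter.

```python
def count_intersections(ref_positions):
    """Count pairs of tokens whose alignment links cross.

    ref_positions[i] is the (assumed) reference position aligned to
    hypothesis token i; unaligned tokens are not modelled here.
    """
    crossings = 0
    for i in range(len(ref_positions)):
        for j in range(i + 1, len(ref_positions)):
            # Links i and j cross when their reference order is reversed.
            if ref_positions[i] > ref_positions[j]:
                crossings += 1
    return crossings


def crossings_per_token(ref_positions):
    """For each hypothesis token, count how many other links it crosses."""
    n = len(ref_positions)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if ref_positions[i] > ref_positions[j]:
                counts[i] += 1
                counts[j] += 1
    return counts
```

A token moved far from its place crosses many links, so the per-token counts may hint at which token is misplaced. Whether this matches the fix intended in the TODO is not stated on the page.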
==== Word Alignment -- Progress and Results ==== | ==== Word Alignment -- Progress and Results ==== | ||
+ | |||
+ | === Latest best results === | ||
+ | [[http:// | ||
=== Alternative model comparison === | === Alternative model comparison === | ||
- | || || Precision/ | ||
- | || || Lex || Order || Punct || Miss || | ||
- | || meteor | ||
- | || ter* ||0.106/ | ||
- | || hmm | ||
- | || lcs | ||
- | || gizadiag* ||0.183/ | ||
- | || gizainter ||0.170/ | ||
- | || berkeley* ||0.200/ | ||
- | === Explicit wrong lex choice detection === | + | hmm = lightweight direct alignment method (in our ACL/TSD article) |
+ | gizainter | ||
+ | gizadiag | ||
+ | czenginter | ||
+ | czengdiag | ||
- | * align input+czeng to reference+czeng and input+czeng to hypotheses+czeng | + | | |
- | * extract hypothesis-to-reference alignments from there | + | | |
+ | ^ ter* | ||
+ | ^ meteor | ||
+ | ^ hmm | ||
+ | ^ lcs | ||
+ | ^ gizainter |0.170/ | ||
+ | ^ gizadiag* |0.183/ | ||
+ | ^ czengdiag* |0.187/ | ||
+ | ^ berkeley* |0.200/ | ||
+ | ^ czenginter |0.197/ | ||
- | || border=1 | + | * non-1-to-1 alignments, converted to 1-to-1 via " |
- | || || Precision/ | + | |
- | || || Lex || Order || Punct || Miss || | + | |
- | || czengdiag* ||0.187/0.514/**0.275** ||0.069/ | + | |
- | || czenginter ||0.197/ | + | |
- | === Alignment combinations | + | === Alignment combinations === |
+ | via weighted HMM | ||
- | || border=1 | + | | ^ Precision/ |
- | || | + | | ^ Lex ^ Order ^ Punct ^ Miss ^ |
- | || | + | ^ ter+hmm |
- | || meteor+hmm ||0.162/0.426/**0.234** ||0.068/0.309/**0.112** ||0.286/0.794/**0.421** ||0.025/0.400/**0.047** || | + | ^ meteor+hmm |0.162/0.426/**0.234** |0.068/0.309/**0.112** |0.286/0.794/**0.421** |0.025/0.400/**0.047** | |
- | || ter+hmm ||0.116/0.402/**0.180** ||0.030/0.184/**0.051** ||0.145/0.912/**0.251** ||0.026/0.181/**0.046** || | + | ^ gizadiag+hmm |
- | || gizadiag+hmm | + | ^ gizainter+hmm |
- | || gizainter+hmm | + | ^ berkeley+hmm |
- | || berkeley+hmm | + | ^ czengdiag+hmm |
- | || czengdiag+hmm | + | ^ czenginter+hmm |
- | || czenginter+hmm | + | | ||||| |
- | || |||||||||| | + | ^ berk+czengint+hmm |0.219/ |
- | || berkeley+czenginter+hmm ||0.219/ | + | ^ berk+czengint+gizaint+hmm |0.220/ |
- | || berkeley+czenginter+gizainter+hmm ||0.220/ | + | ^ berk+czengint+meteor+hmm |0.220/ |
- | || berkeley+czenginter+meteor+hmm | + | ^ berk+czengint+meteor+gizaint+hmm |0.221/ |
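The table cells appear to hold three slash-separated values per error category. The header is truncated, so the following sketch assumes they are precision, recall, and balanced F-score, with the F-score in bold. The helper names are hypothetical. It parses one cell and recomputes the F-score from the other two values.

```python
def f_score(precision, recall):
    """Balanced F-score (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def parse_cell(cell):
    """Split a cell such as '0.162/0.426/**0.234**' into three floats.

    Assumes the order precision/recall/F-score; the bold markers (**)
    around the last value are stripped.
    """
    precision, recall, f = (float(part.strip("*")) for part in cell.split("/"))
    return precision, recall, f


# Example: the Lex cell of the meteor+hmm row.
p, r, f = parse_cell("0.162/0.426/**0.234**")
recomputed = f_score(p, r)
```

For the cells that survive in full on this page, the recomputed value agrees with the listed one to within rounding of the inputs, which is consistent with the assumed reading of the columns.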
- | ==== TODOs ==== | ||
- | * try domain adaptation for word alignment, EMNLP 2011 paper | ||
- | * test alignment with synonym detection (cz_wn required) = separating @@lex@@ and @@disam@@ | ||
- | * order evaluation | ||
- | * a lot of background research | ||
- | * currently finds misplaced items, but their shift distances are off | ||
- | * to fix -- for every misplaced token | ||
- | * if it (and only it) were to be moved in the original permutation, | ||
- | * evaluate with the number of intersections | ||
- | * comb and comment the code | ||
- | * add help files | ||
- | * integrate with the rest of Addicter | ||
- | * learner' | ||
- | * see Anne Lüdelig, TLT9 | ||
- | * adapt to Sara's program | ||
- | * alternative to reference-based evaluation: " | ||