Differences
This shows you the differences between two versions of the page.
external:addicter [2011/05/15 21:25] mphi |
external:addicter [2011/05/22 09:31] (current) mphi |
||
---|---|---|---|
Line 5: | Line 5: | ||
This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | ||
- | !! word alignment | + | ==== TODOs ==== |
- | !!! alternative model comparison | + | * test alignment with synonym detection (cz_wn required) |
- | || border=1 | + | * try misplaced phrase detection |
- | || || Precision/ | + | * parse the reference, extrapolate onto the hypothesis via word alignment, get phrases from there |
- | || || Lex || Order || Punct || Miss || | + | * group adjacent words aligned to the same word? (imitation) |
- | || meteor ||0.092/ | + | * order evaluation |
- | || ter* ||0.106/ | + | * currently finds misplaced items, but their shift distances are off |
- | || hmm ||0.162/ | + | * not important for '' |
- | || lcs ||0.168/ | + | * to fix -- for every misplaced token |
- | || gizadiag* ||0.183/ | + | * if it (and only it) were to be moved in the original permutation, |
- | || gizainter ||0.170/ | + | * evaluate with nr. of intersections |
- | || berkeley* ||0.200/ | + | * try domain adaptation for word alignment with the "via source" |
+ | * technical | ||
+ | * comb and comment the code | ||
+ | * add help files | ||
+ | * integrate with the rest of Addicter | ||
+ | * approach applicable to learner corpora | ||
+ | * see Anne Lüdelig, TLT9 | ||
+ | * try Blast (Sara's program for translation error markup) | ||
+ | * alternative to reference-based evaluation: " | ||
- | !!! Explicit wrong lex choice detection | + | ==== Word Alignment -- Progress and Results ==== |
- | * align input+czeng to reference+czeng and input+czeng to hypotheses+czeng | + | |
- | * extract hypothesis-to-reference alignments from there | + | === Latest best results === |
- | || border=1 | + | [[http:// |
- | || | + | |
- | || | + | === Alternative model comparison === |
- | || czengdiag* ||0.187/ | + | |
- | || czenginter | + | hmm = lightweight direct alignment method (in our ACL/TSD article) |
- | * TODO: also try domain adaptation for word alignment, EMNLP 2011 paper | + | gizainter = GIZA++, intersection -- applied to hypotheses+references directly |
+ | gizadiag = GIZA++, grow-diag -- applied to hypotheses+references directly | ||
+ | czenginter = align source+CzEng to reference+CzEng, | ||
+ | czengdiag | ||
+ | |||
+ | | ^ Precision/ | ||
+ | | ^ Lex | ||
+ | ^ ter* | ||
+ | ^ meteor | ||
+ | ^ hmm |0.162/ | ||
+ | ^ lcs |0.168/ | ||
+ | ^ gizainter | ||
+ | ^ gizadiag* | ||
+ | ^ czengdiag* | ||
+ | ^ berkeley* | ||
+ | ^ czenginter | ||
+ | |||
+ | * non-1-to-1 alignments, converted to 1-to-1 via " | ||
+ | |||
+ | === Alignment combinations === | ||
+ | via weighted HMM | ||
+ | |||
+ | | | ||
+ | | | ||
+ | ^ ter+hmm | ||
+ | ^ meteor+hmm | ||
+ | ^ gizadiag+hmm | ||
+ | ^ gizainter+hmm | ||
+ | ^ berkeley+hmm | ||
+ | ^ czengdiag+hmm | ||
+ | ^ czenginter+hmm | ||
+ | | ||||| | ||
+ | ^ berk+czengint+hmm |0.219/ | ||
+ | ^ berk+czengint+gizaint+hmm |0.220/ | ||
+ | ^ berk+czengint+meteor+hmm |0.220/ | ||
+ | ^ berk+czengint+meteor+gizaint+hmm |0.221/ | ||
- | !!! alignment combinations via weighted HMM | ||
- | || border=1 | ||
- | || || Precision/ | ||
- | || || Lex || Order || Punct || Miss || | ||
- | || meteor+hmm ||0.162/ | ||
- | || ter+hmm ||0.116/ | ||
- | || gizadiag+hmm ||0.186/ | ||
- | || gizainter+hmm ||0.194/ | ||
- | || berkeley+hmm ||0.203/ | ||
- | || czengdiag+hmm ||0.190/ | ||
- | || czenginter+hmm ||0.214/ | ||
- | || |||||||||| | ||
- | || berkeley+czenginter+hmm ||0.219/ | ||
- | || berkeley+czenginter+gizainter+hmm ||0.220/ | ||
- | || berkeley+czenginter+meteor+hmm ||0.220/ | ||
- | !!! TODO | ||
- | * test alignment with synonym detection (cz_wn required) = separating @@lex@@ and @@disam@@ | ||
- | * order evaluation | ||
- | ** a lot of background research | ||
- | ** currently finds misplaced items, but their shift distances are off | ||
- | *** to fix -- for every misplaced token | ||
- | **** if it (and only it) were to be moved in the original permutation, | ||
- | **** evaluate with nr. of intersections | ||
- | * comb and comment the code | ||
- | * add help files | ||
- | * integrate with the rest of Addicter | ||
- | * learner' | ||
- | ** see Anne Lüdelig, TLT9 | ||
- | * adapt to Sara's program | ||
- | * alternative to reference-based evaluation: " |
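
The order-evaluation TODO mentions evaluating "with nr. of intersections". One plausible reading, which is an assumption here and not taken from the Addicter code, is to count the pairs of hypothesis-to-reference alignment links that cross each other. The sketch below illustrates that reading; the function name `count_intersections` and the `(hyp_index, ref_index)` link format are hypothetical.

```python
from itertools import combinations


def count_intersections(links):
    """Count pairs of alignment links that cross each other.

    `links` is an iterable of (hyp_index, ref_index) pairs (assumed format).
    Two links cross when the order of their hypothesis positions
    disagrees with the order of their reference positions.
    """
    unique_links = sorted(set(links))
    crossings = 0
    for (h1, r1), (h2, r2) in combinations(unique_links, 2):
        # Opposite signs mean the two links are ordered differently
        # on the hypothesis side and on the reference side.
        if (h1 - h2) * (r1 - r2) < 0:
            crossings += 1
    return crossings


# Toy alignments, invented for illustration only.
monotone = [(0, 0), (1, 1), (2, 2), (3, 3)]
one_shift = [(0, 1), (1, 2), (2, 3), (3, 0)]  # last hypothesis token belongs first
reversed_order = [(0, 3), (1, 2), (2, 1), (3, 0)]

print(count_intersections(monotone))
print(count_intersections(one_shift))
print(count_intersections(reversed_order))
```

Under this reading, a single token moved across three others produces three crossings, so the count grows with the shift distance of a misplaced token. The pairwise loop is quadratic in the number of links, which should be acceptable for sentence-length inputs.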