Differences
This shows you the differences between two versions of the page.
external:addicter [2011/05/16 07:18] mphi |
external:addicter [2011/05/22 09:31] (current) mphi |
||
---|---|---|---|
Line 4: | Line 4: | ||
This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | This page lies in the external name space and is intended for collaboration with people outside of ÚFAL. | ||
+ | |||
+ | ==== TODOs ==== | ||
+ | * test alignment with synonym detection (cz_wn required) = separating '' | ||
+ | * try misplaced phrase detection | ||
+ | * parse the reference, project the parse onto the hypothesis via word alignment, get phrases from there | ||
+ | * group adjacent words aligned to the same word? (imitation) | ||
+ | * order evaluation | ||
+ | * currently finds misplaced items, but their shift distances are off | ||
+ | * not important for '' | ||
+ | * to fix -- for every misplaced token | ||
+ | * if it (and only it) were to be moved in the original permutation, | ||
+ | * evaluate with nr. of intersections | ||
+ | * try domain adaptation for word alignment with the "via source" | ||
+ | * technical | ||
+ | * comb and comment the code | ||
+ | * add help files | ||
+ | * integrate with the rest of Addicter | ||
+ | * approach applicable to learner' | ||
+ | * see Anne Lüdelig, TLT9 | ||
+ | * try Blast (Sara' | ||
+ | * alternative to reference-based evaluation: " | ||
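The "evaluate with nr. of intersections" item under order evaluation could be sketched roughly as follows. This is only an illustration, assuming a 1-to-1 alignment represented as a permutation (reference position for each hypothesis token); the function names are hypothetical and not taken from the Addicter code.

```python
from itertools import combinations


def count_intersections(permutation):
    """Count pairs of alignment links that cross each other.

    permutation[i] is the reference position aligned to hypothesis
    token i (assumed representation, 1-to-1 alignment only).
    """
    return sum(
        1
        for (i, a), (j, b) in combinations(enumerate(permutation), 2)
        if a > b  # combinations() guarantees i < j, so a > b means a crossing
    )


def intersections_per_token(permutation):
    """For each hypothesis token, count the links crossed by its own link."""
    return [
        sum(
            1
            for j, b in enumerate(permutation)
            if (j < i and b > a) or (j > i and b < a)
        )
        for i, a in enumerate(permutation)
    ]


if __name__ == "__main__":
    # Token 0 of the hypothesis belongs at the end of the reference.
    print(count_intersections([2, 0, 1]))
    print(intersections_per_token([2, 0, 1]))
```

A token whose link crosses many others is a natural candidate for being "misplaced"; how the crossing count would relate to the shift distance is left open here, as it is in the list above.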
==== Word Alignment -- Progress and Results ==== | ==== Word Alignment -- Progress and Results ==== | ||
- | Best result so far (Berkeley+CzengInter+HMM combo): {{wiki: | + | === Latest best results === |
+ | [[http://mtj.ut.ee/ | ||
=== Alternative model comparison === | === Alternative model comparison === | ||
+ | |||
+ | hmm = lightweight direct alignment method (in our ACL/TSD article) | ||
+ | gizainter = GIZA++, intersection -- applied to hypotheses+references directly | ||
+ | gizadiag = GIZA++, grow-diag -- applied to hypotheses+references directly | ||
+ | czenginter = align source+CzEng to reference+CzEng, | ||
+ | czengdiag = same, but with GIZA++ grow-diag | ||
+ | |||
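As a rough illustration of the difference between the "intersection" and "grow-diag" variants listed above, here is a minimal sketch of the two symmetrization heuristics as they are commonly described for combining two directional alignments. The link representation (pairs of hypothesis and reference indices) and the function names are assumptions made for this example; this is not the code used in Addicter or shipped with GIZA++.

```python
# Offsets of the eight neighbouring cells in the alignment matrix
# (horizontal, vertical and diagonal).
NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]


def intersect(forward, backward):
    """Keep only links proposed by both directional alignments."""
    return set(forward) & set(backward)


def grow_diag(forward, backward):
    """Start from the intersection and grow towards the union.

    A neighbouring link from the union is added when at least one of
    its two words is still unaligned. Repeats until nothing changes.
    """
    union = set(forward) | set(backward)
    alignment = set(forward) & set(backward)
    added = True
    while added:
        added = False
        for h, r in sorted(alignment):  # snapshot, safe to add while looping
            for dh, dr in NEIGHBORS:
                cand = (h + dh, r + dr)
                if cand not in union or cand in alignment:
                    continue
                hyp_aligned = any(a == cand[0] for a, _ in alignment)
                ref_aligned = any(b == cand[1] for _, b in alignment)
                if not hyp_aligned or not ref_aligned:
                    alignment.add(cand)
                    added = True
    return alignment


if __name__ == "__main__":
    fwd = {(0, 0), (1, 1), (2, 1), (4, 0)}
    bwd = {(0, 0), (1, 1), (1, 2)}
    print(sorted(intersect(fwd, bwd)))
    print(sorted(grow_diag(fwd, bwd)))
```

The intersection yields fewer, more reliable links, while grow-diag adds adjacent links from the union and so typically produces alignments that are not 1-to-1.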
| | | | ||
| | | | ||
+ | ^ ter* |0.106/ | ||
^ meteor | ^ meteor | ||
- | ^ ter* |0.106/ | ||
^ hmm | ^ hmm | ||
^ lcs | ^ lcs | ||
- | ^ gizadiag* |0.183/ | ||
^ gizainter |0.170/ | ^ gizainter |0.170/ | ||
- | ^ berkeley* |0.200/0.540/**0.291** |0.050/0.330/**0.087** |0.292/0.844/**0.434** |0.039/0.267/**0.068** | | + | ^ gizadiag* |0.183/0.512/**0.270** |0.044/0.250/**0.075** |0.285/0.784/**0.417** |0.038/0.224/**0.065** | |
- | + | ||
- | === Explicit wrong lex choice detection === | + | |
- | * align input+czeng to reference+czeng and input+czeng to hypotheses+czeng | + | |
- | * extract hypothesis-to-reference alignments from there | + | |
- | + | ||
- | | ^ Precision/ | + | |
- | | ^ | + | |
^ czengdiag* |0.187/ | ^ czengdiag* |0.187/ | ||
+ | ^ berkeley* |0.200/ | ||
^ czenginter |0.197/ | ^ czenginter |0.197/ | ||
+ | |||
+ | * non-1-to-1 alignments, converted to 1-to-1 via " | ||
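The three-part table cells appear to hold precision, recall and a bold F-score. Assuming the F-score is the balanced harmonic mean (F1), the third number can be recomputed from the first two; for example, the gizadiag cell 0.183/0.512/0.270 is reproduced by the sketch below, with small deviations possible in other cells because the inputs are already rounded. The function name and the `beta` parameter are illustrative.

```python
def f_score(precision, recall, beta=1.0):
    """F-measure of precision and recall; beta=1.0 gives the balanced F1."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Precision and recall taken from the gizadiag row of the table.
    print(round(f_score(0.183, 0.512), 3))
```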
=== Alignment combinations === | === Alignment combinations === | ||
Line 34: | Line 58: | ||
| | | | ||
| | | | ||
- | ^ meteor+hmm | ||
^ ter+hmm | ^ ter+hmm | ||
+ | ^ meteor+hmm | ||
^ gizadiag+hmm | ^ gizadiag+hmm | ||
^ gizainter+hmm | ^ gizainter+hmm | ||
Line 42: | Line 66: | ||
^ czenginter+hmm | ^ czenginter+hmm | ||
| ||||| | | ||||| | ||
- | ^ berkeley+czenginter+hmm |0.219/ | + | ^ berk+czengint+hmm |0.219/ |
- | ^ berkeley+czenginter+gizainter+hmm |0.220/ | + | ^ berk+czengint+gizaint+hmm |0.220/ |
- | ^ berkeley+czenginter+meteor+hmm |0.220/ | + | ^ berk+czengint+meteor+hmm |0.220/ |
+ | ^ berk+czengint+meteor+gizaint+hmm |0.221/ | ||
- | ==== TODOs ==== | ||
- | * try domain adaptation for word alignment, EMNLP 2011 paper | ||
- | * test alignment with synonym detection (cz_wn required) = separating '' | ||
- | * order evaluation | ||
- | * a lot of background research | ||
- | * currently finds misplaced items, but their shift distances are off | ||
- | * to fix -- for every misplaced token | ||
- | * if it (and only it) were to be moved in the original permutation, | ||
- | * evaluate with nr. of intersections | ||
- | * comb and comment the code | ||
- | * add help files | ||
- | * integrate with the rest of Addicter | ||
- | * learner' | ||
- | * see Anne Lüdelig, TLT9 | ||
- | * adapt to Sara's program | ||
- | * alternative to reference-based evaluation: " | ||