Differences
This shows you the differences between two versions of the page.
external:spr [2011/10/20 15:30] smejkalova |
external:spr [2012/01/19 10:47] (current) smejkalova |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Semantic Pattern Recognition ====== | + | ====== Semantic Pattern Recognition (SPR) -- project webpage ====== |
- | Add a brief description. | + | |
+ | |||
+ | |||
+ | ===== People and contacts ===== | ||
+ | |||
+ | * Silvie Cinková <cinkova (at) ufal.mff.cuni.cz> | ||
+ | * Martin Holub <holub (at) ufal.mff.cuni.cz> | ||
+ | * Lenka Smejkalová <smejkalova (at) ufal.mff.cuni.cz> | ||
+ | * Vincent Kríž <vincent.kriz (at) gmail.com> | ||
+ | |||
+ | Institute of Formal and Applied Linguistics | ||
+ | Charles University in Prague | ||
+ | Faculty of Mathematics and Physics | ||
+ | |||
+ | Malostranské náměstí 25 | ||
+ | CZ-118 00 Praha | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== CPA Verb Validation Sample 30 (En) ===== | ||
+ | |||
+ | "CPA Verb Validation Sample 30 (En)" is a newly developed lexical resource. It contains descriptions of the following 30 English verbs: | ||
+ | |||
+ | |//access// |//ally// |//arrive// |//breathe// |//claim// | | ||
+ | |//cool// |//cry// |//crush// |//deny// |//enlarge// | | ||
+ | |//enlist// |//forge// |//furnish// |//hail// |//halt// | | ||
+ | |//part// |//plug// |//plough// |//pour// |//say// | | ||
+ | |//smash// |//smell// |//steer// |//submit// |//swell// | | ||
+ | |//tell// | //throw// |//trouble// |//wake// |//yield// | | ||
+ | |||
+ | |||
+ | |||
+ | Here we present a small portion of the data to illustrate its structure. For each verb, we define a set of semantic patterns and provide a sample of manually annotated corpus concordances. We then carry out a detailed inter-annotator disagreement analysis, error detection, and adjudication. | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== Annotation Scheme Description ==== | ||
+ | * Pattern Definition Form - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/PDEV2.1-pattern-form.pdf|pdf]] | ||
+ | * Annotation Manual - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/annotation_manual.pdf|pdf]] | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== Pattern Definitions Preview ==== | ||
+ | A few examples of revised PDEV entries: | ||
+ | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/cool_patterns.html|cool]] | ||
+ | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_patterns.html|deny]] | ||
+ | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/yield_patterns.html|yield]] | ||
+ | |||
+ | An example of a pattern definition form in detail: | ||
+ | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_7.png|deny - pattern number 7]] | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== Annotated Data Preview ==== | ||
+ | Each of the examples below contains a multiply annotated set of 50 corpus concordances, together with a manual disagreement analysis and the final adjudication. The adjudicated data are used as a gold-standard sample. | ||
+ | * cool - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/cool_analysis.pdf|pdf]] | ||
+ | * deny - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_analysis.pdf|pdf]] | ||
+ | * yield - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/yield_analysis.pdf|pdf]] | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ==== Disagreement Analysis Preview ==== | ||
+ | Confusion matrices for each pair of annotators and an automatic disagreement analysis: | ||
+ | * cool - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/cool_res.txt|txt]] | ||
+ | * deny - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_res.txt|txt]] | ||
+ | * yield - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/yield_res.txt|txt]] | ||
- | ====== Data ====== | ||
- | * Manual for Annotators - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/annotation_manual.pdf|pdf]] | ||
- | * Specification of the form for defining patterns - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/PDEV2.1-pattern-form.pdf|pdf]] | ||
- | * Pattern definitions - we have revised the pattern definitions for 30 verbs. Here is a sample of three of them (after revision): | ||
- | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/cool_patterns.html|cool]] | ||
- | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_patterns.html|deny]] - detailed [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_7.png|view]] of pattern number 7 | ||
- | * [[http://ufal.mff.cuni.cz/~smejkalova/pdev/yield_patterns.html|yield]] | ||
- | * Annotation of 50 concordances for each of these verbs by three annotators. Here is a small sample of three verbs. The manual disagreement analysis has already been finished, all instances have been adjudicated, and a gold sample has been created. | ||
- | * Manual disagreement analysis and adjudication: | ||
- | * cool - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/cool_analysis.pdf|pdf]] | ||
- | * deny - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_analysis.pdf|pdf]] | ||
- | * yield - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/yield_analysis.pdf|pdf]] | ||
- | * Automatic disagreement analysis - confusion matrices for each pair of annotators, in one file per verb | ||
- | * cool - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/cool_res.txt|txt]] | ||
- | * deny - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/deny_res.txt|txt]] | ||
- | * yield - [[http://ufal.mff.cuni.cz/~smejkalova/pdev/yield_res.txt|txt]] | ||
- | * Inter-annotator agreement (Cohen's kappa for each pair, Fleiss' kappa for all three annotators together) | ||
- | Results of inter-annotator agreement | ||
- | ^verb ^ size ^ #N ^ Fleiss' kappa ^ Cohen's kappa ^^^ | ||
- | | | | | | A2 vs. A3 | A2 vs. A1 | A3 vs. A1 | | ||
- | |cool | 50| 16| 0.685| 0.743| 0.669| 0.646| | ||
- | |deny | 50| 10| 0.524| 0.434| 0.571| 0.582| | ||
- | |yield | 50| 10| 0.500| 0.489| 0.588| 0.429| | ||
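
The page does not say how the agreement figures above were computed. As a rough illustration only, the following Python sketch shows the standard textbook definitions of a pairwise confusion matrix, Cohen's kappa for two annotators, and Fleiss' kappa for several annotators. The function names, the annotator names A1 to A3 as dictionary keys, and the toy pattern labels are all assumptions made for this example; they are not taken from the project data.

```python
from collections import Counter
from itertools import combinations


def confusion_matrix(a, b):
    """Count how often annotator a chose label x while annotator b chose label y."""
    return Counter(zip(a, b))


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences.

    Undefined (division by zero) if both annotators use a single identical label.
    """
    n = len(a)
    cm = confusion_matrix(a, b)
    # Observed agreement: share of items on the diagonal of the confusion matrix.
    p_o = sum(c for (x, y), c in cm.items() if x == y) / n
    # Chance agreement from the two annotators' marginal label frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(annotations):
    """Fleiss' kappa for a list of label sequences, one sequence per annotator."""
    m = len(annotations)      # number of annotators
    n = len(annotations[0])   # number of annotated items
    totals = Counter()
    p_bar = 0.0
    for item in zip(*annotations):
        counts = Counter(item)
        totals.update(counts)
        # Share of agreeing annotator pairs for this item.
        p_bar += (sum(c * c for c in counts.values()) - m) / (m * (m - 1))
    p_bar /= n
    # Chance agreement from the pooled label distribution.
    p_e = sum((t / (n * m)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical pattern numbers assigned to six concordances (toy data).
labels = {
    "A1": ["1", "1", "2", "2", "3", "1"],
    "A2": ["1", "1", "2", "3", "3", "1"],
    "A3": ["1", "2", "2", "2", "3", "1"],
}

for first, second in combinations(labels, 2):
    kappa = cohen_kappa(labels[first], labels[second])
    print(f"{first} vs. {second}: Cohen's kappa = {kappa:.3f}")

print(f"Fleiss' kappa = {fleiss_kappa(list(labels.values())):.3f}")
```

Cohen's kappa compares exactly two annotators, which is why the table reports one value per pair, while Fleiss' kappa summarises all annotators in a single figure.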