
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:hladka:playcoref [2009/02/26 12:40]
hladka
user:hladka:playcoref [2009/03/02 08:38]
mirovsky
Line 77: Line 77:
  
 ====== Specification ======
 +
  
  
Line 85: Line 86:
   * A game of two players. Players are paired randomly. Computer as a player: automatic coreference resolution **???????**
   * Session time up to **???????** minutes.
-  * At the beginning of the game, if there is no coreference pair in the first two sentences (as determined by the manual/automatic pre-annotation), more than two sentences should be displayed, as many as needed for at least one coreference pair to occur in them. Independently of each other, the players hook up the __nouns__ and __pronouns__ that refer to the same object. Once a player has hooked up all the related words (s)he has in mind in the given sentence(s), (s)he asks for the next sentence of the document. The session goes on this way until the end of the session time. (//I dropped the variant in which the faster player can end the game at any time. It would effectively disadvantage the slower player.//) The player who has asked for more sentences in the session obtains bonus speed points.
+  * At the beginning of the game, if there is no coreference pair in the first two sentences (as determined by the manual/automatic pre-annotation), more than two sentences should be displayed, as many as needed for at least one coreference pair to occur in them. Independently of each other, the players hook up the __nouns__ and __pronouns__ that refer to the same object. Once a player has hooked up all the related words (s)he has in mind in the given sentence(s), (s)he asks for the next sentence(s) of the document (how many depends on the number of pairs determined by the manual/automatic pre-annotation). The session goes on this way until the end of the session time. (//I dropped the variant in which the faster player can end the game at any time. It would effectively disadvantage the slower player.//) The player who has asked for more sentences in the session obtains bonus speed points.
    * What is my partner doing? If (s)he hooks up the same pair of words as I did, the pair of words starts **???????**. If (s)he links a word I have not linked so far, that word starts **???????**
    * The players can re-hook up any word at any time in the session.
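
The rule for how many sentences to show at the start of a game can be sketched as follows. This is only an illustration under assumptions that the specification does not state: a pre-annotated coreference pair is represented as two (sentence index, word index) positions, the name `initial_sentence_count` is hypothetical, and when the pre-annotation contains no pair at all the sketch simply falls back to the two-sentence minimum.

```python
from typing import List, Tuple

# Assumed representation: a word position is (sentence_index, word_index),
# both 0-based; a coreference pair is two such positions.
Position = Tuple[int, int]
Pair = Tuple[Position, Position]


def initial_sentence_count(pairs: List[Pair],
                           total_sentences: int,
                           minimum: int = 2) -> int:
    """Return how many leading sentences to display so that at least one
    pre-annotated coreference pair lies entirely inside them.

    Never returns fewer than `minimum` sentences (unless the document is
    shorter) and never more than the document contains.
    """
    if not pairs:
        # Assumption: with no pre-annotated pair, show the default minimum.
        return min(minimum, total_sentences)
    # A pair is fully visible once the sentence of its later word is shown.
    earliest_complete = min(max(a[0], b[0]) for a, b in pairs) + 1
    return min(max(minimum, earliest_complete), total_sentences)
```

For example, if the earliest complete pair links a word in the first sentence to a word in the third, three sentences are displayed instead of two.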
Line 135: Line 136:
  
 **BH**: Jirka is right. The score calculation must be objective. I have therefore changed the formula so that it does not count the player's agreement with the manual annotation.
- 
- 
- 
  
 ===== Output Data Needed =====
    * score list ## //player_id, pts, #sessions//
-   * documents after the ''n''-th session contain the coreference annotations of ''2*n'' players (some of them should be identical, the more identical the better); how to calculate an inter-player agreement?
+   * documents after the ''n''-th session contain the coreference annotations of ''2*n'' players (some of them should be identical, the more identical the better); how to calculate an inter-player agreement? **BH:** The paper we are going to submit to ACL should include a serious discussion of the quality of the data we obtain from the games. Quality goes hand in hand with inter-player agreement and with agreement between a player and the automatic procedure. **Pavel**, would you please take charge of this part? I have already gone through some papers, and so far it seems appropriate to comment on:
+(**JM**: I talked to Zdeněk about measuring inter-annotator agreement in coreference annotation. The outcome was that, for agreement on the arrows, he would simply use F-measure. Kappa is unsuitable because the probability of chance agreement is fairly low and hard to determine; kappa is better suited to classification tasks (which is why I will use it in Anja's project for agreement on the type of coreference, once the annotators have agreed on the arrow). I do not know the others (G-theory and Pearson correlation); I am curious what Pavel will say.)
+        - kappa measure
+        - G-theory - see [[http://en.wikipedia.org/wiki/Generalizability_theory|wiki]], [[http://www.aclweb.org/anthology-new/J/J07/J07-1002.pdf|Petra Saskia Bayerl; Karsten Ingmar Paul: Identifying Sources of Disagreement: Generalizability Theory in Manual Annotation Studies]], Computational Linguistics, Volume 33, Number 1, March 2007.
+        - the Pearson correlation - see (Snow et al., 2008) [[http://ufal.mff.cuni.cz/~hladka/gwap/amt_emnlp08_accepted.pdf|Cheap and Fast - But is it Good? ... ]]
    * session
       * player_A_id, player_B_id
Line 160: Line 163:
       * ...
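
The F-measure on arrows mentioned in the discussion of inter-player agreement can be sketched as below. This is a minimal illustration, not the project's scoring code: it assumes each player's annotation is a collection of unordered word-id pairs, the names `to_links` and `link_f_measure` are hypothetical, two empty annotations are treated as perfect agreement by assumption, and only exact link matches count (links implied by a coreference chain are ignored).

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Assumed representation: a link ("arrow") is an unordered pair of word ids.
Link = FrozenSet[str]


def to_links(pairs: Iterable[Tuple[str, str]]) -> Set[Link]:
    """Normalise (word_a, word_b) tuples so that direction does not matter."""
    return {frozenset(pair) for pair in pairs}


def link_f_measure(pairs_a: Iterable[Tuple[str, str]],
                   pairs_b: Iterable[Tuple[str, str]]) -> float:
    """F-measure of player B's links, taking player A's links as reference.

    With equal weight on precision and recall the value is symmetric,
    so it does not matter which player is used as the reference.
    """
    links_a, links_b = to_links(pairs_a), to_links(pairs_b)
    if not links_a and not links_b:
        return 1.0  # assumption: two empty annotations agree completely
    common = len(links_a & links_b)
    if common == 0:
        return 0.0
    precision = common / len(links_b)
    recall = common / len(links_a)
    return 2 * precision * recall / (precision + recall)
```

The same function could be applied to a player and the automatic procedure, since both produce a set of links over the same document.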
  
 +===== Tools needed =====
 +   * tagger ## tool_chain (CAC2.0)
 +   * Linh's coreference resolution procedure **---PS TO DO---** What type of input data does Linh's procedure work with? ''tool_chain'' is going to be extended with the ''S'' option, which makes it possible to run Vasek Klimes' t-parser in a basic version, i.e. just the t-tree and functors. See more info [[https://wiki.ufal.ms.mff.cuni.cz/user:hladka:playcoref#automaticke-urcovani-koreference-v-ceskych-datech-prehled]].
 +  * conversion: csts <-> pml m_coref scheme
  
  
Line 165: Line 172:
  
  
-
-===== Tools needed =====
-   * tagger ## tool_chain (CAC2.0)
-   * Linh's coreference resolution procedure **---PS TO DO---** What type of input data does Linh's procedure work with? ''tool_chain'' is going to be extended with the ''S'' option, which makes it possible to run Vasek Klimes' t-parser in a basic version, i.e. just the t-tree and functors. See more info [[https://wiki.ufal.ms.mff.cuni.cz/user:hladka:playcoref#automaticke-urcovani-koreference-v-ceskych-datech-prehled]].
-  * conversion: csts <-> pml m_coref scheme
+====== ACL - IJCNLP2009 ======
+   * [[http://www.acl-ijcnlp-2009.org/|Suntec Singapore, August 2-7, 2009]]
+   * [[http://www.acl-ijcnlp-2009.org/main/callforpapers.html#shortpapers|Short papers]], deadline: April 26, 2009. The penultimate version of the paper must be ready by April 12. We will then send the paper to selected colleagues so that they have a week to read and comment on it. That leaves us one week before the deadline.
+   * working directory ''/net/work/projects/playlang/doc/ACL-IJCNLP2009/''
