
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:hladka:playcoref [2009/02/26 12:13]
hladka
user:hladka:playcoref [2009/02/26 12:40]
hladka
Line 77: Line 77:
  
 ====== Specification ======
 +
  
  
Line 87: Line 88:
    * What is my partner doing? If (s)he hooks up the same pair of words as I hooked up, then the pair of words starts **???????**. If (s)he links a word I have not linked so far, then the given word starts **???????**
    * The players can re-hook up any word at any time in the session.
-   * To design the game for a particular language the following data and tools are needed (or are welcome):
+   * To design the game for a particular language the following data and tools are needed (or, better said, are welcome):
      - corpus of manually annotated coreference
      - POS tagger
      - coreference resolution procedure
 +
 +
 +
  
  
Line 101: Line 105:
   * CS data
      * Anja's data    ## // PDT data that are currently being annotated for the extended coreference //
-     * **JM**: It would be nice if the players could choose a domain of the texts to play on (science-fiction, fantasy, thriller, romance, ...), maybe even the author or the very title. The available resources of free electronic books in Czech are scarce but there are plenty of free electronic books in English and other languages, e.g. [[http://www.gutenberg.org/wiki/Main_Page|Project Gutenberg]]. **BH**: It is a very nice idea but I would postpone it till the next versions of the PlayCoref game. However, we have already selected more 'user-friendlytexts into the LGame db - see [[http://ufallab2.ms.mff.cuni.cz/lgame/|this page]]. So we can use them for the PlayCoref game as well.
+     * **JM**: It would be nice if the players could choose a domain of the texts to play on (science-fiction, fantasy, thriller, romance, ...), maybe even the author or the very title. The available resources of free electronic books in Czech are scarce but there are plenty of free electronic books in English and other languages, e.g. [[http://www.gutenberg.org/wiki/Main_Page|Project Gutenberg]]. **BH**: It is a very nice idea but I would postpone it till the next versions of the PlayCoref game. However, we have already selected more user-friendly texts into the LGame db - see [[http://ufallab2.ms.mff.cuni.cz/lgame/|this page]]. So we can use them for the PlayCoref game as well.
-      * **JM** TO DO
+      * **---JM TO DO---** on Anja's data, compute statistics of interest to us, such as
+sentences/document; arrows_noun_noun-noun_pronoun-pronoun-pronoun/document; ...
    * **EN**
       * search the data that are available
Line 129: Line 135:
  
 **BH**: Jirka is right. The score computation must be objective. That is why I modified the formula so that it no longer counts the player's agreement with the manual annotation.
 +
  
  
Line 134: Line 141:
 ===== Output Data Needed =====
    * score list ## //player_id, pts, #sessions//
-   * documents after the ''n''-th session consist of ''2*n'' players coreference annotation (some of them should be identical, the more identical the better)
+   * documents after the ''n''-th session consist of ''2*n'' players coreference annotation (some of them should be identical, the more identical the better); how to calculate an inter-player agreement?
    * session
       * player_A_id, player_B_id
Line 152: Line 159:
       * arrows (**JM**: to avoid too many arrows on the screen, possibly only if the mouse pointer hovers over a word, arrows that start or end at the word would be displayed)
       * ...
 +
  
  
Line 160: Line 168:
 ===== Tools needed =====
    * tagger ## tool_chain (CAC2.0)
-   * Linh's coreference resolution procedure **PS TO DO** What type of input data the Linh's procedure works with? ''tool_chain'' is going to be extended by the ''S'' option enabling to run Vasek Klimes' t-parser in a basic version, i.e. just t-tree and functors. See more info [[https://wiki.ufal.ms.mff.cuni.cz/user:hladka:playcoref#automaticke-urcovani-koreference-v-ceskych-datech-prehled]].
+   * Linh's coreference resolution procedure **---PS TO DO---** What type of input data the Linh's procedure works with? ''tool_chain'' is going to be extended by the ''S'' option enabling to run Vasek Klimes' t-parser in a basic version, i.e. just t-tree and functors. See more info [[https://wiki.ufal.ms.mff.cuni.cz/user:hladka:playcoref#automaticke-urcovani-koreference-v-ceskych-datech-prehled]].
   * conversion: csts <-> pml m_coref scheme   * conversion: csts <-> pml m_coref scheme
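
The revision under "Output Data Needed" leaves open how to calculate an inter-player agreement. One possible answer, sketched here purely as an assumption (the page does not prescribe any measure), is to treat each player's annotation of a document as a set of undirected links between word positions and to average the Jaccard overlap over all pairs of players. The function names and the representation of a link as a pair of integer word positions are hypothetical.

```python
from itertools import combinations


def normalize(links):
    """Turn (word_i, word_j) pairs into unordered links, so (1, 5) equals (5, 1)."""
    return {frozenset(pair) for pair in links}


def pairwise_agreement(links_a, links_b):
    """Jaccard overlap of two players' link sets (assumed measure, not from the spec)."""
    a, b = normalize(links_a), normalize(links_b)
    if not a and not b:
        return 1.0  # two empty annotations agree trivially
    return len(a & b) / len(a | b)


def mean_agreement(annotations):
    """Average pairwise agreement over all players' annotations of one document."""
    pairs = list(combinations(annotations, 2))
    if not pairs:
        return 1.0  # a single annotation has nothing to disagree with
    return sum(pairwise_agreement(x, y) for x, y in pairs) / len(pairs)
```

After the ''n''-th session, `mean_agreement` would be called with the ''2*n'' link lists collected for the document. A chance-corrected measure may be preferable once real data are available; this sketch only counts identical links.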
