user:hladka:playcoref [2009/03/09 10:37] hladka |
user:hladka:playcoref [2009/03/10 10:18] hladka |
- POS tagger | - POS tagger |
- coreference resolution procedure | - coreference resolution procedure |
| |
| |
| |
| |
| |
| |
* Anja's data ## // PDT data that are currently being annotated for the extended coreference // | * Anja's data ## // PDT data that are currently being annotated for the extended coreference // |
* **JM**: It would be nice if the players could choose the domain of the texts to play on (science-fiction, fantasy, thriller, romance, ...), maybe even the author or a specific title. Free electronic books in Czech are scarce, but there are plenty of free electronic books in English and other languages, e.g. [[http://www.gutenberg.org/wiki/Main_Page|Project Gutenberg]]. **BH**: It is a very nice idea, but I would postpone it until later versions of the PlayCoref game. However, we have already selected more user-friendly texts for the LGame db - see [[http://ufallab2.ms.mff.cuni.cz/lgame/|this page]]. So we can use them for the PlayCoref game as well. | * **JM**: It would be nice if the players could choose the domain of the texts to play on (science-fiction, fantasy, thriller, romance, ...), maybe even the author or a specific title. Free electronic books in Czech are scarce, but there are plenty of free electronic books in English and other languages, e.g. [[http://www.gutenberg.org/wiki/Main_Page|Project Gutenberg]]. **BH**: It is a very nice idea, but I would postpone it until later versions of the PlayCoref game. However, we have already selected more user-friendly texts for the LGame db - see [[http://ufallab2.ms.mff.cuni.cz/lgame/|this page]]. So we can use them for the PlayCoref game as well. |
* **JM**: I have reworked the data for playcoref; they now contain only coreference links between nodes with tags N or P. The data are in the directory: ''/net/work/projects/playlang/playcoref/data/02_bridging_playcoref/train-1''. I computed a table in which these files from train-1 are sorted in descending order by the ratio (number of coref. arrows)/(number of words). The table is here: | * **JM (6/3/09)**: I have reworked the data for playcoref; they now contain only coreference links between nodes with tags N or P. The data are in the directory: ''/net/work/projects/playlang/playcoref/data/02_bridging_playcoref/train-1''. I computed a table in which these files from train-1 are sorted in descending order by the ratio (number of coref. arrows)/(number of words). [[http://ufal.mff.cuni.cz/~hladka/PlayCoref/_text_coref_proportions.txt|The table is here]] (the first column is the ratio (number of coref. arrows)/(number of words), the second column is the file name, the third column is the number of coref. arrows, the fourth column is the number of words.) |
''/net/work/projects/playlang/playcoref/data/02_bridging_playcoref/train-1/_text_coref_proportions.txt'' (the first column is the ratio (number of coref. arrows)/(number of words), the second column is the file name, the third column is the number of coref. arrows, the fourth column is the number of words.) | |
* **EN** | * **EN** |
* search the data that are available | * search the data that are available |
* sentence by sentence | * sentence by sentence |
* supervised selection of documents for a session | * supervised selection of documents for a session |
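The proportions table mentioned in JM's note (ratio, file name, number of coreference arrows, number of words) could feed the supervised selection of documents for a session. The following is only a minimal sketch: it assumes the four columns are whitespace-separated, and the names ''DocStats'', ''parse_proportions'' and ''densest_documents'' as well as the sample file names are hypothetical, not part of the project.

```python
from typing import List, NamedTuple


class DocStats(NamedTuple):
    ratio: float    # (number of coref. arrows) / (number of words)
    filename: str
    arrows: int
    words: int


def parse_proportions(text: str) -> List[DocStats]:
    """Parse the four-column table; whitespace separation is an assumption."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        try:
            rows.append(DocStats(float(parts[0]), parts[1],
                                 int(parts[2]), int(parts[3])))
        except ValueError:
            continue  # skip lines whose numeric columns do not parse
    return rows


def densest_documents(rows: List[DocStats], top_n: int) -> List[DocStats]:
    """Return the top_n documents with the highest arrows-per-word ratio."""
    return sorted(rows, key=lambda r: r.ratio, reverse=True)[:top_n]


# Made-up sample rows, for illustration only.
sample = """\
0.10 doc_a.t.gz 10 100
0.25 doc_b.t.gz 50 200
0.05 doc_c.t.gz 5 100
"""
best = densest_documents(parse_proportions(sample), top_n=2)
```

Sorting inside ''densest_documents'' means the selection does not depend on the file already being in descending order.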
| |
| |
| |
| |
===== Scoring ===== | ===== Scoring ===== |
* ''pts_of_player_A = w1*(player_A's_output vs. automatic_annotation) + w1*(player_A's_output vs. player_B's_output) + speed_pts'' | * ''pts_of_player_A = w1*(player_A's_output vs. automatic_annotation) + w2*(player_A's_output vs. player_B's_output) + speed_pts'' |
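The scoring formula combines agreement with the automatic annotation, agreement with the other player, and speed points. The notes do not say how two outputs are compared, so the sketch below assumes, purely for illustration, an F1 score over sets of coreference links; ''link_f1'', ''player_points'' and the ''(anaphor, antecedent)'' link representation are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation: (anaphor word index, antecedent word index).
Link = Tuple[int, int]


def link_f1(a: Set[Link], b: Set[Link]) -> float:
    """F1 overlap of two link sets (an assumed comparison measure)."""
    if not a and not b:
        return 1.0  # two empty annotations agree completely
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(a)
    recall = overlap / len(b)
    return 2 * precision * recall / (precision + recall)


def player_points(player_a: Set[Link], player_b: Set[Link],
                  automatic: Set[Link], w1: float = 1.0,
                  w2: float = 1.0, speed_pts: float = 0.0) -> float:
    """Weighted sum: agreement with the automatic annotation,
    agreement with the other player, plus speed points."""
    return (w1 * link_f1(player_a, automatic)
            + w2 * link_f1(player_a, player_b)
            + speed_pts)


# Made-up example links.
a = {(3, 1), (5, 1)}
b = {(3, 1), (5, 1)}
auto = {(3, 1)}
pts = player_points(a, b, auto, w1=2.0, w2=1.0, speed_pts=0.5)
```

Keeping the comparison measure in its own function makes it easy to swap in a different agreement metric once one is chosen.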
| |
**JM**: | **JM**: |
* arrows (**JM**: to avoid too many arrows on the screen, possibly only if the mouse pointer hovers over a word, arrows that start or end at the word would be displayed) | * arrows (**JM**: to avoid too many arrows on the screen, possibly only if the mouse pointer hovers over a word, arrows that start or end at the word would be displayed) |
* ... | * ... |
| |
| |
| |
| |
===== Tools needed ===== | ===== Tools needed ===== |
* tagger ## tool_chain (CAC2.0) | * tagger ## tool_chain (CAC2.0) |
* Linh's coreference resolution procedure **---PS TO DO---** What type of input data does Linh's procedure work with? ''tool_chain'' is going to be extended with the ''S'' option, which will make it possible to run Vasek Klimes' t-parser in a basic version, i.e. just t-tree and functors. See more info [[https://wiki.ufal.ms.mff.cuni.cz/user:hladka:playcoref#automaticke-urcovani-koreference-v-ceskych-datech-prehled]]. | * Linh's coreference resolution procedure - see TectoMT - **JM** |
* conversion: csts <-> pml m_coref scheme | * try it out - training and test - on Anja's data |
| * conversion: csts <-> pml m_coref scheme |
| |
| |