Differences
This shows you the differences between two versions of the page.
draft [2009/07/15 12:44] ptacek |
draft [2009/07/15 15:13] ptacek |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Progress Report ====== | ====== Progress Report ====== | ||
+ | [[Progress Report]] - I moved this to a separate page so that we do not get in each other's way | ||
- | Hi Marc, | ||
- | ... | ||
- | |||
- | Re: progress: there is progress in the following: | ||
- | |||
- | - speech re-training for the collected dialogue data | ||
- | - additional dialogue transcription for ASR is ongoing (WP52.? T5.2.1) | ||
- | - DM has been transferred from USFD to Prague (WP5.3) | ||
- | being extensively tested | ||
- | - DAF editor transfer is complete (WP5.3) | ||
- | - Sample dialogues (specifically aimed at the demo) | ||
- | are ready - issues are being resolved between CU/ZCU | ||
- | - DAFs are being prepared for the SC-CZ scenario AND | ||
- | the sample dialogues | ||
- | - DA set is being prepared, also based on the sample dialogues (WP5.2) | ||
- | - preliminary DA tagger (on std DAMSL-SWBD tagset) working (~35% error rate) (WP5.2) | ||
- | - integration work is ongoing (CU/ZCU, internally at CU) | ||
- | but no functioning full demo yet (beyond what we've presented in Madrid) | ||
- | |||
- | I hope this is OK for the progress report. Pavel (I.) might add more specifics regarding the ASR and especially TTS progress. | ||
- | |||
- | Best, | ||
- | |||
- | -- Jan | ||
====== Description of Czech Companion November Prototype ====== | ====== Description of Czech Companion November Prototype ====== | ||
- | The Czech version of Companion deals with the Reminiscing about User's Photos scenario. | + | The Czech version of Companion deals with the Reminiscing about User's Photos scenario |
+ | however, the set of modules differs (see Figure 1). As for the physical setup, the Czech version runs on two notebook computers connected by a local network: one acts as a Speech Client, running the modules for ASR, TTS and ECA; the other acts as an NLP Server. | ||
photopal domain, recorded corpus, DAFs (reusing the SHEFF DM integrated through the Inamode Relayer (TID)) are suitable for it, and moreover reusable for the expected POMDP DM from UOX (reuse states, let pomdp' | photopal domain, recorded corpus, DAFs (reusing the SHEFF DM integrated through the Inamode Relayer (TID)) are suitable for it, and moreover reusable for the expected POMDP DM from UOX (reuse states, let pomdp' | ||
- | reply types and how they are implemented, | + | reply types and how they are implemented, | ||
NLP server with TectoMT, ASR/TTS/SR client, connected over network | NLP server with TectoMT, ASR/TTS/SR client, connected over network | ||
XXX JPta | XXX JPta | ||
Line 40: | Line 19: | ||
- | ===== Speech Reconstruction ===== | ||
- | features: omit filler phrases, irrelevant speech events, false starts, repetitions, | ||
- | implementation (include this info?): Moses trained on the corpus | ||
- | performance indicator: BLEU score (overall scoring of all features) against annotated corpora from T5.2.1, plus some baseline | ||
- | XXX Mirek | ||
- | ===== Morphology Analyzer and POS tagging ===== | ||
- | features: XXX Mirek/ | ||
- | performance indicator: accuracy | ||
- | ===== Syntactic Parsing ===== | + | |
+ | |||
+ | ===== Automatic Speech Recognition (WP 5.1) ===== | ||
+ | features: improved language models, real-time speaker adaptation | ||
+ | performance indicator: WER | ||
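The WER indicator named above can be illustrated with a minimal sketch. It assumes whitespace-tokenised reference and hypothesis transcripts; the function name `word_error_rate` is hypothetical and is not part of any module described on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, so it is an error rate rather than a percentage bounded by 100.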
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== Speech Reconstruction (WP 5.1 ???) ===== | ||
+ | features: omit filler phrases, remove irrelevant speech events, handle false starts, repetitions, | ||
+ | performance indicator: BLEU score between actual output and manually reconstructed sentences from corpora (T5.2.1), baseline: Moses with default settings | ||
+ | |||
+ | |||
+ | ===== Morphology Analyzer and POS tagging (WP 5.2) ===== | ||
+ | features: coverage of photo-pal domain (WILL JARKA ADD THE WORDS WE FIND?
+ | performance indicator: OOV rate, accuracy | ||
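As a rough sketch of the two indicators above, OOV rate can be computed as the share of tokens missing from the analyzer's lexicon, and accuracy as the share of tokens whose predicted tag matches the gold tag. The names `oov_rate` and `accuracy`, the lowercasing step, and the representation of the lexicon as a plain set are assumptions made for illustration only.

```python
def oov_rate(tokens, lexicon):
    """Fraction of tokens that the lexicon does not cover."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    # Assumption: the lexicon stores lowercased word forms.
    unknown = sum(1 for token in tokens if token.lower() not in lexicon)
    return unknown / len(tokens)


def accuracy(predicted, gold):
    """Fraction of positions where the predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)
```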
+ | |||
+ | ===== Syntactic Parsing | ||
features: induce dependencies and labels | features: induce dependencies and labels | ||
performance indicator: f-measure | performance indicator: f-measure | ||
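One way to read the f-measure indicator for parsing is over labelled dependency edges. The sketch below assumes each edge is a `(dependent, head, label)` tuple and that predicted and gold edges are compared as sets; the function name and the edge representation are hypothetical, not taken from the parser itself.

```python
def f_measure(predicted, gold, beta=1.0):
    """F-beta score over two collections of hashable items (e.g. labelled edges)."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        # Covers empty inputs and the case of no overlap at all.
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

The same function applies unchanged to the other f-measure indicators on this page, provided the compared items (entities, predicates, roles) are represented as hashable tuples.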
Line 57: | Line 48: | ||
- | ===== Semantic Parsing ===== | + | ===== Semantic Parsing |
features: meaning representation with semantic roles (69 roles), coordinations, | features: meaning representation with semantic roles (69 roles), coordinations, | ||
performance indicator: f-measure | performance indicator: f-measure | ||
- | ===== Information Extraction ===== | + | ===== Information Extraction |
features: template based identification of predicates | features: template based identification of predicates | ||
covering predicates from before-mentioned set of DAFs. | covering predicates from before-mentioned set of DAFs. | ||
Line 67: | Line 58: | ||
- | ===== Named Entities Recognition ===== | + | ===== Named Entities Recognition |
features: detect person names, geographical locations (organizations are not needed, I think) | features: detect person names, geographical locations (organizations are not needed, I think) | ||
performance indicator: f-measure | performance indicator: f-measure | ||
- | ===== Dialog Act Tagging ===== | + | ===== Dialog Act Tagging |
features: tagset derived from DAMSL-SWBD; the DA is a key feature driving the decision of what to say next. | features: tagset derived from DAMSL-SWBD; the DA is a key feature driving the decision of what to say next. | ||
performance indicator: accuracy | performance indicator: accuracy | ||
- | ===== Sentiment Analysis ===== | + | ===== Sentiment Analysis |
features: I would present the classifier as this, | features: I would present the classifier as this, | ||
performance indicator: f-measure | performance indicator: f-measure | ||
+ | |||
+ | |||
+ | |||
Line 84: | Line 78: | ||
===== Complete System Evaluation ===== | ===== Complete System Evaluation ===== | ||
T5.2.7 mentions this; Nick Webb probably will not do it for us | T5.2.7 mentions this; Nick Webb probably will not do it for us | ||
- | performance indicator: | + | performance indicator: |
- | ===== Dialog Manager ===== | + | ===== Dialog Manager |
features: reply types, using (language-independent) predicates (in practice this means that I will name the tests on the DAF transitions in English) | features: reply types, using (language-independent) predicates (in practice this means that I will name the tests on the DAF transitions in English) | ||
+ | Handmade DAF covering following topics: Person_Retired, | ||
performance indicator: manual assessment of the acceptability of the selected action | performance indicator: manual assessment of the acceptability of the selected action | ||
- | ===== Natural Language Generation ===== | + | ===== Natural Language Generation |
features: variations, underspecified input (dott format), emotional markup (hard-coded in the DAFs and templates for evaluative sentences) | features: variations, underspecified input (dott format), emotional markup (hard-coded in the DAFs and templates for evaluative sentences) | ||
performance indicator: BLEU score | performance indicator: BLEU score |