Differences
This shows you the differences between two versions of the page.
draft [2009/07/14 16:12] ptacek |
draft [2009/09/30 18:38] ptacek |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Description | + | ====== |
- | photopal domain, recorded corpus, that there are DAFs for it (reusing SHEFF DM integrated through Inamode Relayer | + | The Czech Companion follows the original idea of Reminiscing about the User's Photos,
- | reply types | + | taking advantage of the data collected in the first phase of the project (using a Wizard-of-Oz setting). The full recorded corpus was transcribed, manual speech reconstruction was done on 92.6% of utterances((Manual speech reconstruction is still in progress.)) and a pilot dialog act annotation was performed on a sample of 1000 sentences.
- | NLP server with TectoMT, ASR/TTS/SR client, connected over network | + |
- | XXX JPta | + | |
- | advances | + | The architecture is the same as in the English version, i.e. a set of modules communicating through the Inamode |
- | pos ? analyze, generate and check ' | + |
+ | The NLU pipeline, DM, and NLG modules at the NLP Server are implemented using CU's own TectoMT platform, which provides access to a single in-memory data representation through a common API. This eliminates the overhead of repeated serialization and XML parsing that an Inamode-based solution would otherwise impose. | ||
- | ===== Speech Reconstruction ===== | + | The Knowledge Base consists of objects (persons, events, |
- | features: omit filler phrases, irrelevant speech | + | |
- | implementation (include this info?): Moses trained on the corpus | + |
- | performance indicator: BLEU score (overall scoring | + | |
- | XXX Mirek | + | |
- | ===== Morphology Analyzer and POS tagging ===== | ||
- | features: XXX Mirek/ | ||
- | performance indicator: accuracy | ||
- | |||
- | ===== Syntactic Parsing ===== | ||
- | features: induce dependencies and labels | ||
- | performance indicator: f-measure | ||
- | the tip says to train MacDonnald on dialog data, that task runs until M42, so that we
- | |||
- | |||
- | ===== Semantic Parsing ===== | ||
- | features: meaning representation with semantic roles (69 labels), coordinations, | ||
- | performance indicator: f-measure | ||
- | |||
- | ===== Information Extraction ===== | ||
- | features: template based identification of predicates | ||
- | covering predicates from before-mentioned set of DAFs. | ||
- | performance indicator: accuracy | ||
- | |||
- | ===== Named Entities Recognition ===== | ||
- | features: detect person names, geographical locations (are organizations needed?)
- | performance indicator: f-measure | ||
- | |||
- | ===== Dialog Act Tagging ===== | ||
- | features: tagset derived from DAMSL-SWBD, DA is a key feature driving | ||
- | performance indicator: | ||
- | |||
- | |||
- | ===== Sentiment Analysis ===== | ||
- | features: I would present a classifier as this,
- | performance indicator: f-measure | ||
- | |||
- | |||
- | ===== Complete System Evaluation ===== | ||
- | T5.2.7 mentions this, Nick Webb probably won't do it for us
- | performance indicator: number of words in the user's utterances(?
- | |||
- | |||
- | |||
- | |||
- | ===== Dialog Manager ===== | ||
- | features: reply types, using (language-independent) predicates (in practice this means I will name the tests on the transitions in the DAFs in English)
- | performance indicator: manual evaluation of the acceptability of the selected action
- | |||
- | ===== Natural Language Generation ===== | ||
- | features: variations, underspecified input (dott format), emotional markup (hard-coded in the DAFs and templates for evaluative sentences)
- | performance indicator: BLEU score |
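
Several modules in the old revision (syntactic parsing, semantic parsing, named entities, sentiment analysis) list f-measure as their performance indicator without saying how it would be computed. The sketch below is one possible reading, not the project's actual evaluation code: it assumes the balanced F1 variant and assumes gold and predicted annotations can be compared as sets of hashable items. The function name `f_measure` and the example entity pairs are made up for illustration.

```python
from typing import Hashable, Set, Tuple


def f_measure(
    gold: Set[Hashable],
    predicted: Set[Hashable],
    beta: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F_beta) for two sets of annotations.

    Assumption: an item counts as correct only if it appears in both
    sets exactly (e.g. same entity span and same label).
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        # No overlap at all: define the score as 0 instead of dividing by zero.
        return precision, recall, 0.0
    b2 = beta * beta
    score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, score


# Hypothetical named-entity example: (surface form, label) pairs.
gold_entities = {("Praha", "LOC"), ("Jan", "PER"), ("Brno", "LOC")}
predicted_entities = {("Praha", "LOC"), ("Jan", "LOC")}

precision, recall, f1 = f_measure(gold_entities, predicted_entities)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

For dependency parsing the same function could be applied to sets of (dependent, head, label) triples, assuming that is the unit the evaluation is meant to score.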