Differences
This shows you the differences between two versions of the page.
draft [2009/07/16 09:48] ptacek |
draft [2009/09/01 00:15] ufal |
||
---|---|---|---|
Line 4: | Line 4: | ||
- | + | [[http:// | |
====== Description of Czech Companion November Demonstrator ====== | ====== Description of Czech Companion November Demonstrator ====== | ||
Line 14: | Line 13: | ||
Our DAFs covering selected topics contain not only Companion replies mined from the corpora, but also new human-authored assessments, | Our DAFs covering selected topics contain not only Companion replies mined from the corpora, but also new human-authored assessments, | ||
+ | |||
+ | For a sample dialogue, see the Scenario Brief below. | ||
{{user: | {{user: | ||
Line 20: | Line 21: | ||
===== Automatic Speech Recognition (WP 5.1)===== | ===== Automatic Speech Recognition (WP 5.1)===== | ||
features: improved language models, real-time speaker adaptation | features: improved language models, real-time speaker adaptation | ||
- | performance indicator: WER | + | performance indicator: WER |
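WER (word error rate) is conventionally the word-level edit distance between the recognizer output and a reference transcript, divided by the number of reference words. The page does not say how the demonstrator computes it, so the following is only a minimal sketch, assuming whitespace tokenization and unit cost for substitutions, insertions and deletions. The function name `word_error_rate` is hypothetical and not part of the demonstrator.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both strings are tokenized on whitespace and every
    substitution, insertion and deletion costs 1.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving 0.5.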
Line 26: | Line 27: | ||
+ | ===== Speech Reconstruction (WP 5.2) ===== | ||
+ | features: omit filler phrases, remove irrelevant speech events, handle false starts, repetitions, | ||
- | ===== Speech Reconstruction (WP 5.2) ===== | ||
- | features: omit filler phrases, remove irrelevant speech events, handle false starts, repetitions, | ||
- | performance indicator: BLEU score between actual output and manually reconstructed sentences from corpora (T5.2.1), baseline: Moses with default settings | ||
Line 41: | Line 41: | ||
===== Morphology Analyzer and POS tagging (WP 5.2) ===== | ===== Morphology Analyzer and POS tagging (WP 5.2) ===== | ||
features: coverage of photo-pal domain, domain adapted tagger | features: coverage of photo-pal domain, domain adapted tagger | ||
- | performance indicator: OOV rate, accuracy | + | performance indicator: OOV rate, accuracy |
Line 48: | Line 49: | ||
===== Syntactic Parsing (WP 5.2) ===== | ===== Syntactic Parsing (WP 5.2) ===== | ||
features: induce dependencies and labels | features: induce dependencies and labels | ||
- | performance indicator: accuracy (correctly induced edges, labels) | + | performance indicator: accuracy (correctly induced edges (84%), labels) |
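One common reading of "correctly induced edges, labels" is a pair of per-token scores: the share of tokens with the correct head, and the share with both the correct head and the correct label. The page does not confirm that this is how the parser is evaluated, so the sketch below is an assumption. The name `attachment_scores` and the `(head, label)` token representation are hypothetical.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (edge accuracy, edge-and-label accuracy) over aligned tokens.

    Assumption: gold and predicted describe the same tokens in the same order.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")

    # A token counts for the first score if its head matches, and for
    # the second score only if head and label both match.
    correct_edges = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    correct_labeled = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return correct_edges / total, correct_labeled / total
```

With four tokens, three correct heads and two fully correct arcs, the function returns `(0.75, 0.5)`.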
Line 56: | Line 57: | ||
===== Semantic Parsing (WP 5.2) ===== | ===== Semantic Parsing (WP 5.2) ===== | ||
features: assignment of semantic roles (69 roles), coordinations, | features: assignment of semantic roles (69 roles), coordinations, | ||
- | performance indicator: accuracy (correctly induced edges, labels) | + | performance indicator: accuracy (correctly induced edges, labels) |
===== Information Extraction (WP 5.2) ===== | ===== Information Extraction (WP 5.2) ===== | ||
Line 88: | Line 90: | ||
manual creation of DAFs covering the following topics: Person_retired, | manual creation of DAFs covering the following topics: Person_retired, | ||
performance indicator: acceptability - manual evaluation of actions selected by DM | performance indicator: acceptability - manual evaluation of actions selected by DM | ||
+ | |||
Line 95: | Line 98: | ||
===== Natural Language Generation (WP 5.4) ===== | ===== Natural Language Generation (WP 5.4) ===== | ||
- | features: adding function words, morphological adjustments, | + | features: adding function words, morphological adjustments, |
performance indicator: BLEU score | performance indicator: BLEU score | ||
Line 103: | Line 106: | ||
features: emotions will be expressed implicitly, through the use of communicative functions; a new female voice database was recorded for this purpose | features: emotions will be expressed implicitly, through the use of communicative functions; a new female voice database was recorded for this purpose | ||
performance indicator: listening tests | performance indicator: listening tests | ||
+ | |||
Line 109: | Line 113: | ||
features: new Czech female voice with affective features will be integrated with the TID avatar | features: new Czech female voice with affective features will be integrated with the TID avatar | ||
performance indicator: subjective evaluation of the naturalness and the ability to convey emotions (small-scale, | performance indicator: subjective evaluation of the naturalness and the ability to convey emotions (small-scale, | ||
- | |||
- | --------------cut here---------------- | ||
- | ====== | + | ====== |
- | ===== Syntactic Parsing | + | C1 Hello, what is your name? |
- | features: adapted to domain (McD trained on manual PDTSC trees) | + | H1 Marie. |
- | performance indicator: accuracy (correctly induced edges, labels) | + | |
+ | C2 Nice to meet you. I am Pavla and I would like to get to know you better. I am glad that you are going to show me your photographs. | ||
- | ===== Sentiment Analysis | + | C2 What is this? |
- | features: I would pass off as this the classifier that decides | + | H2 This is my family: my parents, my sister, her husband, my two children and her two children. |
- | performance indicator: f-measure | + | H2 They came to visit me at Christmas. |
+ | |||
+ | C3 Do you spend Christmas together like this? | ||
+ | H3 Not anymore. | ||
+ | H3 All my little ones have flown the nest by now. | ||
+ | |||
+ | C4 What is your husband's name? | ||
+ | H4 My husband is no longer alive; his name was Karel Šístek. | ||
+ | |||
+ | C5 Do you like Christmas? | ||
+ | H5 Not anymore. | ||
+ | H5 I liked it when the children were little. | ||
+ | H5 But now it is just not the same. | ||
+ | |||
+ | C6 So now you spend Christmas alone? (ask_on_detail-daf-based) | ||
+ | H6 I am not completely alone, I go to my daughter's every Christmas, but it is not what it used to be when there were small children at home. | ||
+ | |||
+ | C7 Once the children grow up, one suddenly has too much time. | ||
+ | C7 Who is who in that photo? | ||
+ | H7 By the Christmas tree is my nephew, the little girl next to him is my daughter, the little boy with the toy is my son, next to him are my sister and her husband, and above him is my father. | ||
+ | |||
+ | C8 You are quite a big family. | ||
+ | C8 What is your sister's name? (companion commenting last utterance | ||
+ | H8 My sister's name is Marie Havlová, her husband is Vladimír; unfortunately he died recently. | ||
+ | |||
+ | C9 That is sad. | ||
+ | C9 Did you know him well? | ||
+ | H9 Very well, our two families spent a lot of time together. | ||
+ | |||
+ | C10 Do you have any grandchildren yet? | ||
+ | H10 I have a granddaughter, Natálka. | ||
+ | H10 She is such a lively little one, a real fidget. | ||
+ | |||
+ | C11 So Natálka is a fidget. (repeat-rephrase) | ||
+ | C11 Do you have any other granddaughters or grandsons? | ||
+ | H11 We are expecting another addition to the family. | ||
+ | |||
+ | C12 Do you ever babysit them? | ||
+ | H12 I babysit quite often. | ||
+ | H12 I like children and I love being with them. | ||
- | ===== Complete System Evaluation ===== | + | C13 That is nice. |
- | T5.2.7 mentions this, Nick Webb probably will not do it for us | + | C13 Children are fun. |
- | performance indicator: number of tokens in user reply utterances, post-session questionnaire | + | H13 Yes. |
+ | C Shall we look at the next photo? | ||
- | ===== advances ===== | ||
- | advances in Czech NLU (on reconstructed spoken data): 300-500 sentences(? | ||
- | pos ? analyze, generate and check ' | ||