In this talk I will discuss issues related to information presentation in an interactive system,
WikiTalk (Wilcock & Jokinen, 2011). WikiTalk supports open-domain conversations using Wikipedia
as a knowledge base, and it has been implemented on the Nao robot as a spoken dialogue system. The
novel feature of the system is that it extends the robot’s interaction capabilities by enabling Nao to
talk about an unlimited range of topics.
I will focus especially on how to present new information in a manner that allows the user to
follow the presentation. The user can query Wikipedia via the Nao robot and have chosen entries
read out by the robot. In a text-free environment the user needs to infer the structure of the article
from the robot’s output. Wikipedia entries are large blocks of text which can be very
monotonous when simply read out by a synthetic voice, and comprehension could be enhanced
by adding non-verbal cues to the discourse-level organization of the text. In Wikipedia, relevant
information is marked with hyperlinks to other entries. A system where the robot could signal
these links non-verbally while reading the text would allow the user to further query the
encyclopedia without recourse to explicit menus.
The articles are considered as possible Topics that the robot can talk about, while each link in the
article is treated as new information that the user can shift their attention to and ask for more
information about. The paragraphs and sentences in the article are considered as propositional
chunks, i.e. pieces of information that structure the topic into subtopics and form the minimal
units for presentation: each can be presented in one 'utterance' by the robot.
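
The mapping described above (article as Topic, links as NewInfo, sentences as chunks) can be
sketched in a few lines of Python. This is only an illustration under stated assumptions, not the
WikiTalk implementation: the `[[Target]]` link markup, the `Chunk` class, the `chunk_article`
function and the naive sentence splitter are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Assumption: links are written as [[Target]] in the article text.
LINK = re.compile(r"\[\[([^\]]+)\]\]")


@dataclass
class Chunk:
    """One sentence: the minimal unit presented in a single utterance."""
    text: str                                           # sentence without link markup
    new_info: List[str] = field(default_factory=list)   # link targets in the sentence


def chunk_article(article: str) -> List[List[Chunk]]:
    """Split an article into paragraphs, and each paragraph into sentence chunks."""
    paragraphs = []
    for para in article.split("\n\n"):
        chunks = []
        # Naive split: sentence ends at '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", para.strip()):
            if not sentence:
                continue
            links = LINK.findall(sentence)
            chunks.append(Chunk(LINK.sub(r"\1", sentence), links))
        if chunks:
            paragraphs.append(chunks)
    return paragraphs


article = "[[Nao]] is a robot. It talks.\n\nIt uses [[Wikipedia]]."
for paragraph in chunk_article(article):
    for chunk in paragraph:
        print(chunk.text, chunk.new_info)
```

Each inner list corresponds to one paragraph (a subtopic), and each `Chunk` records which
NewInfo links the user could ask about after hearing it.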
The challenge in presenting the Wikipedia information is how to convey its structure to the user
so that they can understand which links carry new information, and how to navigate in the topic
structure smoothly. In dialogue management, topics are usually managed by a stack, which
allows a convenient last-in-first-out mechanism to handle topics that have been recently talked
about. We use topic trees (cf. McCoy and Cheng 1990, Jokinen et al. 1996) in which topics are
structured into a tree that enables more flexible management of the recent topics.
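
To make the contrast with a stack concrete, here is a minimal sketch of a topic tree. It is an
assumption-laden illustration rather than the data structure of the cited work: the `TopicNode`
and `TopicTree` classes and their method names are hypothetical. The point it shows is that
leaving a subtopic does not discard it, as popping a stack would, so a recently visited topic
can be resumed later.

```python
class TopicNode:
    """A topic (e.g. a Wikipedia article) with its parent and subtopics."""

    def __init__(self, title, parent=None):
        self.title = title
        self.parent = parent
        self.children = []


class TopicTree:
    def __init__(self, root_title):
        self.root = TopicNode(root_title)
        self.current = self.root

    def shift_to(self, title):
        """Follow a link: open it as a subtopic of the current topic."""
        for child in self.current.children:
            if child.title == title:      # resume an already visited subtopic
                self.current = child
                return child
        node = TopicNode(title, parent=self.current)
        self.current.children.append(node)
        self.current = node
        return node

    def go_back(self):
        """Return to the parent topic; the subtopic stays in the tree."""
        if self.current.parent is not None:
            self.current = self.current.parent
        return self.current

    def path(self):
        """Titles from the root down to the current topic."""
        titles = []
        node = self.current
        while node is not None:
            titles.append(node.title)
            node = node.parent
        return titles[::-1]


tree = TopicTree("Helsinki")
tree.shift_to("Finland")
tree.go_back()
tree.shift_to("Baltic Sea")
print(tree.path())
```

After this sequence "Finland" is no longer the current topic, but it remains a child of
"Helsinki" and can be returned to with another `shift_to("Finland")`.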
Moreover, we use the concepts of Topic and NewInfo (Jokinen 2009) where Topic refers to the
particular issue (Wiki-article) that the speakers are talking about, and NewInfo is the part of the
message that is new in the context of the current Topic (links). It must be emphasized that
dialogue coherence, i.e. the relation between consecutive utterances being such that the listener
can readily understand what their connection is, appears straightforward: we can rely on the
structure of Wikipedia to provide coherence for us. As Wikipedia articles have already
been written so that they form a coherent text, we take advantage of this and assume that the
content of the topics and possible NewInfo links is coherent. Meaningfulness of the interaction is
based on the user's interest rather than a particular task structure that would limit the suitable
topics for the interaction.
However, what is important in our case is to capture the speakers' attentional state in such a way
that the user can focus their attention on NewInfo. We have experimented with various gestures to
mark the NewInfo and to provide structuring for the WikiTalk presentation. Gesture and posture
changes could also be used to help manage turn-taking in Nao’s dialogue, while the inclusion of
gesture in Nao’s conversational repertoire would also enhance expressivity and add liveliness to
the interaction. We identified a set of gestures which could be used to:
• Mark discourse-level details such as paragraph and sentence boundaries
• Indicate hyperlinks
• Help manage turn-taking
• Add expressivity or liveliness
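
One way to picture how the first three gesture functions could be attached to a presentation
is the sketch below. It only interleaves abstract cue labels with the sentences to be spoken
and does not name or implement any actual Nao gesture. The function `plan_presentation` and
the step labels are hypothetical, and for simplicity a link cue is placed after its sentence,
whereas a real system would presumably time it with the spoken link word.

```python
from typing import List, Optional, Tuple

Sentence = Tuple[str, List[str]]       # (text, link targets in the sentence)
Step = Tuple[str, Optional[str]]       # (action label, payload)


def plan_presentation(paragraphs: List[List[Sentence]]) -> List[Step]:
    """Interleave speech with non-verbal cue labels (all labels are hypothetical)."""
    plan: List[Step] = []
    for p_index, paragraph in enumerate(paragraphs):
        if p_index > 0:
            plan.append(("mark_paragraph", None))    # paragraph boundary cue
        for s_index, (text, links) in enumerate(paragraph):
            if s_index > 0:
                plan.append(("mark_sentence", None)) # sentence boundary cue
            plan.append(("say", text))
            for target in links:
                plan.append(("mark_link", target))   # hyperlink (NewInfo) cue
    plan.append(("yield_turn", None))                # hand the turn to the user
    return plan


paragraphs = [
    [("Nao is a robot.", ["Nao"]), ("It talks.", [])],
    [("It uses Wikipedia.", ["Wikipedia"])],
]
for step in plan_presentation(paragraphs):
    print(step)
```

Expressivity and liveliness are left out of the sketch, since they are not tied to a
particular point in the text structure.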
KEYWORDS: discourse and dialogue structuring, topic trees, new information, Wikipedia,
feedback, turn-taking