Measuring the differences between human-human and human-machine dialogs

Abstract

In this paper, we assess whether user simulation techniques can generate dialogs that are similar to real human-machine spoken interactions. To do so, we compare three corpora acquired by means of different techniques. The first corpus was acquired with real users. The second corpus was acquired by applying a statistical user simulation technique to the same task. In this technique, the next user answer is selected by a classification process that takes into account the previous dialog history, the lexical information in the clause, and the subtask of the dialog to which it contributes. Finally, a dialog simulation technique was developed to acquire the third corpus. This technique selects the user and system turns at random and defines stop conditions for automatically deciding whether the simulated dialog is successful. We compare the three corpora using several evaluation measures proposed in previous research, and then discuss their similarities and differences with regard to these measures.
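
The random dialog simulation technique mentioned above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the authors' implementation: the dialog act names, the uniform random choice, the success test (the system closes after all required data has been given), the failure test (an exchange budget runs out), and the decision to keep only successful dialogs in the corpus.

```python
import random

# Hypothetical turn inventories; the actual dialog acts are not given in the abstract.
SYSTEM_ACTS = ["ask_origin", "ask_destination", "ask_date", "confirm", "close"]
USER_ACTS = ["give_origin", "give_destination", "give_date", "affirm", "reject"]

# Assumed stop conditions (illustrative only).
REQUIRED = {"give_origin", "give_destination", "give_date"}
MAX_EXCHANGES = 20


def simulate_dialog(rng, max_exchanges=MAX_EXCHANGES):
    """Pick system and user turns uniformly at random until a stop condition fires."""
    dialog, provided = [], set()
    for _ in range(max_exchanges):
        system_act = rng.choice(SYSTEM_ACTS)
        dialog.append(("S", system_act))
        if system_act == "close":
            # Assumed success test: all required data was given before closing.
            return dialog, REQUIRED <= provided
        user_act = rng.choice(USER_ACTS)
        dialog.append(("U", user_act))
        provided.add(user_act)
    # Assumed failure test: the exchange budget ran out.
    return dialog, False


def acquire_corpus(n_dialogs, seed=0):
    """Collect simulated dialogs, keeping only those judged successful (an assumption)."""
    rng = random.Random(seed)
    corpus = []
    while len(corpus) < n_dialogs:
        dialog, success = simulate_dialog(rng)
        if success:
            corpus.append(dialog)
    return corpus
```

Calling `acquire_corpus(100)` would return 100 simulated dialogs, each a list of `(speaker, act)` pairs. A corpus built this way could then be compared with a real-user corpus using dialog-level measures such as the average number of turns.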

How to cite

Griol, D., & Molina, J. (2015). Measuring the differences between human-human and human-machine dialogs. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 4(2), 99–112. https://doi.org/10.14201/ADCAIJ20154299112
