From VoiceXML to multimodal mobile Apps: development of practical conversational interfaces

  • David Griol
    Universidad Carlos III de Madrid dgriol[at]inf.uc3m.es
  • Jose Manuel Molina
    Universidad Carlos III de Madrid

Abstract

Speech technologies and language processing have enabled a number of new applications based on conversational interfaces. In this paper, we describe two approaches that bridge the gap between the academic and industrial perspectives by developing conversational interfaces with an academic paradigm for dialog management while employing industrial standards. Advances in these technologies have made it possible to extend conversational interfaces from spoken-only interaction (for instance, by means of VoiceXML-based systems) to multimodal services on mobile devices (for instance, using the facilities provided by the Android OS). Our proposal has been evaluated through the successful development of different spoken and multimodal conversational interfaces.

How to cite

Griol, D., & Molina, J. M. (2016). From VoiceXML to multimodal mobile Apps: development of practical conversational interfaces. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 5(3), 43–53. https://doi.org/10.14201/ADCAIJ2016534353
