Control Prosody using Multi-Agent System

Kenji MATSUI, Kenta KIMURA, Alberto PÉREZ

Abstract


People who have undergone a laryngectomy have a few options for partially restoring speech, but no device is completely satisfactory. Although an electrolarynx (EL) is the easiest way for a patient to produce speech, it does not produce a natural tone, and its use looks far from normal. For these reasons, and because none of these devices is hands-free, the feasibility of using a motion sensor to replace the conventional EL user interface has been explored. The motion sensor of a mobile device, together with a multi-agent platform, was used to investigate on/off and pitch frequency control. A very small, battery-operated, ARM-based control unit was also developed to evaluate the motion-sensor-based user interface. The control unit is worn on the wrist, and the vibration device is held against the throat with a support bandage. Two methods were used to convert forearm tilt angle to pitch frequency: a linear mapping method and an F0 template-based method. A perceptual evaluation was performed with two well-trained normal speakers and ten subjects. The results showed that both methods can produce better speech quality in terms of naturalness.
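
As an illustration of the linear mapping idea only, the following is a minimal sketch of how a forearm tilt angle might be converted to a pitch frequency and an on/off decision. The function names, the angle range (0 to 60 degrees), the F0 range (80 to 160 Hz), and the 10-degree on/off threshold are all hypothetical values chosen for the example; the abstract does not state the parameters actually used, and this is not the authors' implementation.

```python
def tilt_to_f0(angle_deg, angle_min=0.0, angle_max=60.0,
               f0_min=80.0, f0_max=160.0):
    """Linearly map a forearm tilt angle (degrees) to a pitch frequency (Hz).

    All ranges are hypothetical defaults, not values from the paper.
    """
    # Clamp the angle so that out-of-range tilts saturate at the F0 limits.
    clamped = max(angle_min, min(angle_max, angle_deg))
    # Position of the angle within its range, from 0.0 to 1.0.
    fraction = (clamped - angle_min) / (angle_max - angle_min)
    # Interpolate linearly between the lowest and highest pitch.
    return f0_min + fraction * (f0_max - f0_min)


def vibrator_on(angle_deg, on_threshold_deg=10.0):
    """Return True when the tilt exceeds a (hypothetical) on/off threshold."""
    return angle_deg >= on_threshold_deg


if __name__ == "__main__":
    for angle in (-5.0, 5.0, 30.0, 90.0):
        print(angle, vibrator_on(angle), tilt_to_f0(angle))
```

With these assumed defaults, a tilt of 30 degrees falls halfway through the angle range and so maps to the midpoint of the pitch range. An F0 template-based method would instead select or scale a stored pitch contour, which this sketch does not attempt.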

Keywords


Prosody; Electrolarynx; Hands-free; Multi-agent system; Agents






DOI: http://dx.doi.org/10.14201/ADECAIJ20131





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.
