A New Algorithm for Speaker Emotion Recognition in Human-Robot Interaction

Article type: Research paper

Authors

1 M.Sc. student, Islamic Azad University, South Tehran Branch

2 Faculty of Mechanical Engineering, K. N. Toosi University of Technology, Tehran, Iran

3 Mechanical Engineering, Islamic Azad University, Pardis Branch

Abstract

Speech is the easiest channel of communication between humans and machines, and a prerequisite for this communication is that the machine understands human emotion. In the proposed algorithm, with the aim of increasing the speed and accuracy of recognition, Mel-frequency cepstral coefficients (MFCCs) are extracted from the speech signal and an optimal subset of features is selected. Emotion recognition is then performed using a combination of support vector machine (SVM) and Gaussian mixture model (GMM) classifiers. Implementing the algorithm yielded a recognition accuracy of 89% on the German database and 68% on the Persian database. A comparison with similar algorithms shows that the proposed algorithm performs well for the design of robot control and guidance systems.
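The abstract does not say how the SVM and GMM outputs are combined. As a rough illustration only, the sketch below assumes one common option, weighted score-level fusion: each classifier produces one score per emotion class, the scores are normalized with a softmax, and a weighted average selects the label. The label set `EMOTIONS`, the functions `softmax`, `fuse`, and `predict`, and the `svm_weight` parameter are hypothetical names introduced here, not the authors' method.

```python
import math

# Hypothetical label set, for illustration only.
EMOTIONS = ["anger", "happiness", "sadness", "neutral"]


def softmax(scores):
    """Map real-valued scores to a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(svm_scores, gmm_loglik, svm_weight=0.5):
    """Weighted average of the two classifiers' normalized class scores.

    svm_scores: one SVM decision score per class (assumed input).
    gmm_loglik: one GMM log-likelihood per class (assumed input).
    svm_weight: share given to the SVM, between 0 and 1 (assumed parameter).
    """
    if len(svm_scores) != len(gmm_loglik):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= svm_weight <= 1.0:
        raise ValueError("svm_weight must lie in [0, 1]")
    p_svm = softmax(svm_scores)
    p_gmm = softmax(gmm_loglik)
    return [svm_weight * a + (1.0 - svm_weight) * b
            for a, b in zip(p_svm, p_gmm)]


def predict(svm_scores, gmm_loglik, labels=EMOTIONS, svm_weight=0.5):
    """Return the label whose fused score is highest."""
    fused = fuse(svm_scores, gmm_loglik, svm_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]
```

In such a setup the per-class scores would come from classifiers trained on the selected MFCC features, and `svm_weight` would be tuned on held-out data; setting it to 0 or 1 reduces the fusion to a single classifier.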
