A Survey of Multi-modal Emotion Recognition Based on Deep Learning

Authors

  • Muhan Jia
  • Zijian Sun

DOI:

https://doi.org/10.54097/37zncv36

Keywords:

Multi-modal, emotion recognition, deep learning, fusion methods.

Abstract

Multi-modal emotion recognition integrates facial expressions, voice intonation, text, and other data sources to recognize emotion, with the aim of making human-computer interaction more natural and accurate. Addressing this emerging field, this paper reviews single-modal emotion recognition methods for three modalities (text, face, and voice), focuses on the problem of multi-modal emotion fusion, and introduces the fusion methods that have achieved high recognition success rates in recent years. A comparative analysis shows that current fusion methods have become more complex and that their success rates have improved to some extent. However, few data sets are available for multi-modal emotion analysis, and research on gesture and other modalities of emotion recognition remains scarce. Future work should enrich the available data sets and incorporate new modalities to improve the accuracy and robustness of multi-modal emotion recognition and analysis systems.

Published

11-12-2024

How to Cite

Jia, M., & Sun, Z. (2024). A Survey of Multi-modal Emotion Recognition Based on Deep Learning. Highlights in Science, Engineering and Technology, 119, 533-540. https://doi.org/10.54097/37zncv36