The Evolution and Future Development of the Q&A Dialogue System

Wenbo Sun

doi:10.54097/h90gh082

Keywords:

Dialogue system, Affective computing, Human-computer interaction, Deep learning, Multimodal fusion.

Abstract

Artificial intelligence is developing rapidly. Early conversational systems could handle only simple tasks, whereas modern systems can follow a conversation and respond with an appropriate emotional tone. This shift matters greatly for human-machine interaction: the goal is no longer a system that returns rigid responses to simple commands, but one that understands context and emotion. In the initial stage, the technology was rule-based and limited to fixed “if-then” patterns. If the user’s question matched a pattern, the system could deliver the answer; otherwise, it could not respond correctly. Once systems were able to learn from data, they began to draw on multiple large-scale data sources. The latest deep learning methods, trained on such data, have made dialogue with machines more capable and better at understanding users. Each stage has opened a new path for further development. Text sentiment recognition is one notable achievement: for example, online systems can recognize the mood behind a user’s question and generate an answer with a suitable tone and sentiment by matching emotions learned from large datasets. Reaching the next level remains difficult, however. Machines still struggle to grasp the real meaning of a conversation, responses may drift away from the overall context, and solutions to real-life problems are incomplete. Performance also degrades on topics the system was not trained on. Future work should explore ways to take in multi-dimensional information, such as sound and images, and to develop systems that can learn on their own without huge amounts of data. Only by solving these problems can human-computer interaction become more natural and more emotionally aware.
