Improving NLP Accuracy with Stack-Propagation and Knowledge Distillation: A Joint Model for Intent Detection and Slot Filling
DOI: https://doi.org/10.54097/fcis.v3i2.7525
Keywords: Natural Language Understanding, Slot-filling, Intent Detection, Stack-Propagation, Knowledge Distillation
Abstract
Intent detection and slot filling are fundamental tasks in Natural Language Understanding (NLU) and important components of intelligent question-answering systems. The two tasks are closely interrelated and together form a core part of semantic understanding in natural language processing. In this paper, we propose an architecture that uses Stack-Propagation to improve the accuracy of NLU tasks. Our joint model feeds token-level intent detection output, together with the sentence's word vectors, into the slot-filling component so that slot filling can draw on intent semantics. Additionally, we use knowledge distillation (KD) to improve model efficiency and strengthen the correlation between the two tasks. Our framework differs from existing joint models in two respects: it directly leverages intent information within the joint model, and it uses token-level intent information for slot filling to reduce error propagation. Furthermore, because Stack-Propagation passes intent information to slot filling explicitly, the interaction between the two tasks is more interpretable, whereas other models let the tasks interact only implicitly through hidden states. We evaluated our model on two publicly available datasets; the results show that it achieves state-of-the-art performance and outperforms previous methods by a significant margin on both slot filling and intent detection. In future work, we will combine the Stack-Propagation framework with the KD module in a Transformer model, which may further improve performance on NLU tasks.
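The data flow described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy vectors, the majority vote used to obtain a sentence-level intent, and the temperature-scaled KL form of the distillation loss are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
from collections import Counter


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def stack_propagate(token_vectors, token_intent_logits):
    """Append each token's intent distribution to its word vector.

    The concatenated vectors are what a slot-filling decoder would consume,
    so intent information reaches slot filling explicitly.
    """
    return [
        vec + softmax(logits)
        for vec, logits in zip(token_vectors, token_intent_logits)
    ]


def sentence_intent(token_intent_logits):
    """Assumed aggregation: majority vote over per-token intent predictions."""
    votes = [
        max(range(len(logits)), key=logits.__getitem__)
        for logits in token_intent_logits
    ]
    return Counter(votes).most_common(1)[0][0]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed KD loss: KL(teacher || student) on softened outputs, times T^2."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student))
    return kl * temperature ** 2


# Hypothetical toy input: three tokens, 2-dim word vectors, two intent classes.
token_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
token_intent_logits = [[2.0, 0.5], [1.5, 0.2], [0.1, 0.9]]

slot_inputs = stack_propagate(token_vectors, token_intent_logits)
predicted_intent = sentence_intent(token_intent_logits)
```

In this sketch, a single token with a wrong intent prediction affects only its own slot input rather than the whole sentence, which is one way to read the abstract's claim that token-level intent information eases error propagation.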


