Deep Learning Based Text Classification Methods
DOI: https://doi.org/10.54097/hset.v34i.5478

Keywords: text classification, natural language processing, deep learning

Abstract
Text classification is an indispensable task in natural language processing. With the development of Internet technology, the medium for transmitting information has shifted from letters to the Internet, and as the volume of information has grown, manual data annotation has become inefficient. After 2010, the emergence of deep learning methods brought text classification into a new era. Methods such as ReNN, MLP, RNN, CNN, attention mechanisms, Transformers, and GNNs were developed in succession and became widely known, reflecting a trend toward models that no longer rely on manually engineered text features but instead learn and model directly from the text content. This paper begins with fundamentals, introducing the nature, applications, and historical background of text classification; it then briefly reviews shallow learning before turning to deep learning. Classic models are selected from the six families of methods, from ReNN to Transformer, and each is briefly analysed in terms of its underlying principle, the problems it solves well, and the problems or shortcomings it does not handle well. Finally, the paper reports the performance of these models on benchmark datasets.
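The shift the abstract describes, away from manually engineered features toward representations learned directly from text, can be illustrated with a minimal sketch. The toy corpus, labels, and all hyperparameters below are illustrative assumptions (not from the paper); word embeddings are trained jointly with a softmax classifier, so no hand-crafted features are needed:

```python
import numpy as np

# Toy sentiment corpus; sentences and labels are illustrative assumptions.
corpus = [
    ("the movie was great and fun", 1),       # 1 = positive
    ("a wonderful touching film", 1),
    ("the plot was dull and boring", 0),      # 0 = negative
    ("a terrible waste of time", 0),
]

vocab = {w: i for i, w in enumerate(sorted({w for s, _ in corpus for w in s.split()}))}
rng = np.random.default_rng(0)

d, n_classes = 8, 2
E = rng.normal(scale=0.1, size=(len(vocab), d))   # word embeddings (learned)
W = rng.normal(scale=0.1, size=(n_classes, d))    # classifier weights
b = np.zeros(n_classes)

def forward(tokens):
    ids = [vocab[t] for t in tokens]
    h = E[ids].mean(axis=0)                       # average-pool token embeddings
    z = W @ h + b
    p = np.exp(z - z.max())
    return ids, h, p / p.sum()                    # softmax probabilities

lr = 0.5
for _ in range(300):                              # plain SGD over the toy corpus
    for sent, y in corpus:
        ids, h, p = forward(sent.split())
        dz = p.copy()
        dz[y] -= 1.0                              # softmax cross-entropy gradient
        dh = W.T @ dz                             # backprop into the pooled vector
        W -= lr * np.outer(dz, h)
        b -= lr * dz
        for i in ids:                             # update embeddings of seen words
            E[i] -= lr * dh / len(ids)

def predict(sentence):
    """Predicted class for a sentence made of known vocabulary words."""
    return int(forward(sentence.split())[2].argmax())
```

This mean-of-embeddings model is closest in spirit to the deep averaging network of Iyyer et al. (reference below); the RNN, CNN, and Transformer models surveyed in the paper replace the averaging step with more expressive encoders while keeping the same end-to-end training principle.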
References
Socher R et al. Semi-supervised recursive autoencoders for predicting sentiment distributions [C]. In Proceedings of the 2011 conference on empirical methods in natural language processing (pp. 151-161), 2011.
Socher R et al. Semantic compositionality through recursive matrix-vector spaces [C]. In Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning (pp. 1201-1211), 2012.
Socher R et al. Recursive deep models for semantic compositionality over a sentiment treebank [C]. In Proceedings of the 2013 conference on empirical methods in natural language processing (pp. 1631-1642), 2013.
Iyyer M et al. Deep unordered composition rivals syntactic methods for text classification [C]. In Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: Long papers) (pp. 1681-1691), 2015.
Shi X et al. Convolutional LSTM network: A machine learning approach for precipitation nowcasting [J]. Advances in neural information processing systems, 28, 2015.
Cho K et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation [R]. arXiv preprint arXiv:1406.1078, 2014.
Tai K S et al. Improved semantic representations from tree-structured long short-term memory networks [R]. arXiv preprint arXiv:1503.00075, 2015.
Kim Y. Convolutional neural networks for sentence classification [C]. In Proceedings of EMNLP (pp. 1746-1751), 2014.
Kalchbrenner N et al. A convolutional neural network for modelling sentences [R]. arXiv preprint arXiv:1404.2188, 2014.
Zhang X et al. Character-level convolutional networks for text classification [J]. Advances in neural information processing systems, 28, 2015.
Yang Z et al. Hierarchical attention networks for document classification [C]. In Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies (pp. 1480-1489), 2016.
Peters M E et al. Deep contextualized word representations [C]. In Proceedings of NAACL (pp. 2227-2237), 2018.
Devlin J et al. BERT: Pre-training of deep bidirectional transformers for language understanding [R]. arXiv preprint arXiv:1810.04805, 2018.
Joshi M et al. SpanBERT: Improving pre-training by representing and predicting spans [J]. Transactions of the Association for Computational Linguistics, 8, 64-77, 2020.
Lan Z et al. ALBERT: A lite BERT for self-supervised learning of language representations [C]. In Proceedings of ICLR, 2020.
Li Q et al. A Survey on Text Classification: From Traditional to Deep Learning [J]. ACM Transactions on Intelligent Systems and Technology (TIST), 13(2), 1-41, 2022.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.