Model Comparison in Sentiment Analysis: A Case Study of Amazon Product Reviews
DOI:
https://doi.org/10.54097/hset.v16i.2224
Keywords:
Natural Language Processing, Multi-class Classification, Model Comparison, Hybrid Sequential Binary Classification (HSBC).
Abstract
Sentiment analysis is an essential task in NLP, especially for businesses, because it can improve customer service. This paper presents a case study of sentiment analysis on Amazon reviews of Kindle books. First, it applies several algorithms, both non-deep-learning (Logistic Regression, Naïve Bayes, and Support Vector Machine) and deep learning (Convolutional Neural Network and Recurrent Neural Network), and compares their accuracies. For the deep learning methods in particular, it studies how accuracy changes with the number of hidden layers. Second, because the product review data set has five labels, ranging from one star to five stars, the task is a multi-class text classification problem. The paper introduces a new method called Hybrid Sequential Binary Classification (HSBC), which improves the performance of classical binary classifiers on such a problem. Finally, HSBC is compared with multi-class classification models.
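The abstract does not say how HSBC chains its binary classifiers, so the following is only a rough sketch of one generic sequential (cascade) scheme for a five-star problem, not the paper's actual method. Each stage asks "is this review class k, or something later in the list?", and a review that every stage rejects falls through to the last class. The names `TinyNaiveBayes` and `SequentialBinaryCascade`, as well as the toy reviews, are hypothetical and exist only for illustration.

```python
from collections import Counter
import math


class TinyNaiveBayes:
    """Binary multinomial Naive Bayes over whitespace tokens (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for text, y in zip(texts, labels):
            self.counts[y].update(text.lower().split())
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def predict_one(self, text):
        scores = {}
        n_docs = sum(self.docs.values())
        for y in (0, 1):
            total = sum(self.counts[y].values()) + len(self.vocab)
            # Laplace-smoothed log prior plus log likelihood of known tokens.
            score = math.log((self.docs[y] + 1) / (n_docs + 2))
            for tok in text.lower().split():
                if tok in self.vocab:
                    score += math.log((self.counts[y][tok] + 1) / total)
            scores[y] = score
        return 1 if scores[1] > scores[0] else 0


class SequentialBinaryCascade:
    """Chain of binary classifiers: stage k separates class k from the classes after it."""

    def __init__(self, classes, make_classifier=TinyNaiveBayes):
        self.classes = list(classes)
        self.make_classifier = make_classifier

    def fit(self, texts, labels):
        self.stages = []
        remaining = list(zip(texts, labels))
        for cls in self.classes[:-1]:
            stage_texts = [t for t, _ in remaining]
            stage_labels = [1 if y == cls else 0 for _, y in remaining]
            clf = self.make_classifier().fit(stage_texts, stage_labels)
            self.stages.append((cls, clf))
            # Later stages only see examples that do not belong to this class.
            remaining = [(t, y) for t, y in remaining if y != cls]
        return self

    def predict_one(self, text):
        for cls, clf in self.stages:
            if clf.predict_one(text) == 1:
                return cls
        return self.classes[-1]  # rejected by every stage


# Invented toy reviews (text, star rating); not taken from the paper's data set.
reviews = [
    ("terrible waste", 1), ("terrible boring", 1),
    ("weak plot", 2), ("weak ending", 2),
    ("average read", 3), ("average story", 3),
    ("good characters", 4), ("good pacing", 4),
    ("excellent masterpiece", 5), ("excellent writing", 5),
]
texts = [t for t, _ in reviews]
stars = [s for _, s in reviews]

model = SequentialBinaryCascade(classes=[1, 2, 3, 4, 5]).fit(texts, stars)
accuracy = sum(model.predict_one(t) == s for t, s in reviews) / len(reviews)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

Because `make_classifier` is just a factory, the same cascade could wrap any binary model (for example a Logistic Regression or SVM from a library of your choice), which is one way to compare a chained binary approach against a direct multi-class model on the same data.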
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







