White Wine Quality Prediction and Analysis with Machine Learning Techniques
DOI:
https://doi.org/10.54097/hset.v39i.6548
Keywords:
White Wine; Quality Prediction; SMOTE; Random Forests; Multiple Logistic Regression.
Abstract
Wine originated 7,500 years ago and is one of the most popular alcoholic beverages in the world today. With modern advances in production processes, wine quality has improved significantly. White wines can be graded by measuring various physical and chemical quantities. In practice, white wines rarely differ dramatically in quality, so precisely capturing the characteristics that distinguish them is difficult. In addition, the available wine quality dataset is relatively small and imbalanced, which makes it hard to train a machine learning model for wine quality prediction. This paper explores the applicability of logistic regression machine learning models to wine quality prediction. The Synthetic Minority Oversampling Technique (SMOTE) algorithm is applied to address the data imbalance. The key features in wine production are analyzed to identify the physical or chemical quantities that are most likely to influence the quality of white wines. In several tests, the random forest model gives good results, and free sulfur dioxide, chlorides, and fixed acidity are the characteristics most likely to influence the quality of white wines.
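As an illustration of the oversampling step mentioned in the abstract, the core idea of SMOTE can be sketched in a few lines: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is a minimal sketch using only the Python standard library, not the authors' implementation; the function name `smote_oversample`, the parameter defaults, and the toy rows are assumptions made for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy rows (alcohol, free sulfur dioxide) for a rare quality grade.
rare_grade = [(12.8, 30.0), (12.5, 28.0), (13.1, 33.0), (12.9, 35.0)]
new_rows = smote_oversample(rare_grade, n_new=6)
print(len(rare_grade) + len(new_rows))
```

Because every synthetic row is a convex combination of two real rows, it stays inside the range spanned by the observed minority samples. In practice a maintained library implementation would normally be preferred, applied to the training split only.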
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







