White Wine Quality Prediction and Analysis with Machine Learning Techniques

Authors

  • Xianghui Jiang
  • Xuanyu Liu
  • Yutong Wu
  • Dehuai Yang

DOI:

https://doi.org/10.54097/hset.v39i.6548

Keywords:

White Wine; Quality Prediction; SMOTE; Random Forests; Multiple Logistic Regression.

Abstract

Wine originated about 7,500 years ago and is one of the most popular alcoholic beverages in the world today. Modern advances in production processes have significantly improved its quality. White wines can be graded by measuring various physical and chemical quantities. In practice, white wines rarely differ dramatically in quality, so precisely capturing the characteristics that distinguish them is difficult. In addition, the available wine quality dataset is relatively small and imbalanced, which makes it hard to train a machine learning model for wine quality prediction. This paper explores the applicability of machine learning models such as logistic regression to wine quality prediction. The Synthetic Minority Oversampling Technique (SMOTE) is applied to address the data imbalance. Key features are analyzed to identify the physical or chemical quantities most likely to influence the quality of white wines. In several tests, the random forest model gives good results, and free sulfur dioxide, chlorides, and fixed acidity are the characteristics most likely to influence the quality of white wines.
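To illustrate the oversampling step mentioned in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy feature values are assumptions made for illustration, and the base sample is chosen at random for simplicity rather than generating a fixed number of points per sample.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For each minority sample, store the indices of its k nearest neighbours
    # (Euclidean distance, the sample itself excluded).
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random base sample (simplification)
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment i -> j
        base, neighbour = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy, made-up two-feature vectors for a rare quality class
# (hypothetical values, not taken from the paper's dataset).
rare_class = [[6.8, 0.030], [7.1, 0.034], [6.5, 0.028], [7.4, 0.036]]
new_points = smote(rare_class, n_synthetic=6, k=2)
```

Because every synthetic point is a convex combination of two real minority samples, it stays within the range already covered by that class. In practice the features would usually be scaled first so that no single quantity dominates the distance computation.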

Published

01-04-2023

How to Cite

Jiang, X., Liu, X., Wu, Y., & Yang, D. (2023). White Wine Quality Prediction and Analysis with Machine Learning Techniques. Highlights in Science, Engineering and Technology, 39, 321-326. https://doi.org/10.54097/hset.v39i.6548