Research on the Diamond Price Prediction based on Linear Regression, Decision Tree and Random Forest


  • Zhe OuYang



Multiple Linear Regression, Decision Tree Regression, Random Forest Regression, Diamond Price Prediction


Diamonds symbolize pure, indestructible love and are a luxury that people have long sought. However, because most consumers know little about diamonds, they often rely solely on what salespeople and jewelers tell them when buying, which makes it difficult to purchase a diamond whose price matches its value. To address this problem, this paper uses a Multiple Linear Regression model, a Decision Tree Regression model, and a Random Forest Regression model to predict diamond prices from the diamond evaluation metrics in the data set, so that consumers can see the typical price for a diamond with given evaluation metrics. The results show that the Random Forest Regression model has the best fitting and predictive ability on the diamond price prediction task and is therefore the recommended model.
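
The three-model comparison described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and uses a small synthetic data set, not the paper's data. The feature names (`carat`, `cut`, `color`, `clarity`), the price formula, the 80/20 split, the hyperparameters, and the choice of R2, MAE and RMSE as metrics are all assumptions made for illustration; the paper's own preprocessing and settings are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a diamond data set (hypothetical, NOT the paper's data).
rng = np.random.default_rng(0)
n = 2000
carat = rng.uniform(0.2, 3.0, n)
cut = rng.integers(0, 5, n)      # hypothetical ordinal grade codes
color = rng.integers(0, 7, n)
clarity = rng.integers(0, 8, n)

# Hypothetical price rule: nonlinear in carat, plus grade effects and noise.
price = (3000 * carat**2 + 150 * cut + 100 * color + 120 * clarity
         + rng.normal(0, 300, n))

X = np.column_stack([carat, cut, color, clarity])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# The three model families named in the abstract, with illustrative settings.
models = {
    "Multiple Linear Regression": LinearRegression(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=0),
    "Random Forest Regression": RandomForestRegressor(
        n_estimators=100, random_state=0
    ),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "R2": float(r2_score(y_test, pred)),
        "MAE": float(mean_absolute_error(y_test, pred)),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    }

for name, scores in results.items():
    print(name, scores)
```

Which model scores best in this sketch depends on the synthetic price rule and says nothing about the paper's results; the point is only the shape of the comparison, with every model trained and evaluated on the same split.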








How to Cite

OuYang, Z. (2024). Research on the Diamond Price Prediction based on Linear Regression, Decision Tree and Random Forest. Highlights in Business, Economics and Management, 24, 248-257.