Machine Learning-Based Models for Accurate Car Prices Prediction

Authors

  • Chenguang Li

DOI:

https://doi.org/10.54097/9zcpv779

Keywords:

Car Price Prediction, Machine Learning, Random Forest, Linear Regression, Decision Tree.

Abstract

The used car market is growing in many countries, yet some car trading platforms predict prices inaccurately. Buyers and sellers therefore need machine learning models that predict used car prices with high accuracy. This study selects the three factors it identifies as having the greatest impact on used car prices: car name, years of use, and mileage. The dataset contains more than 2000 records from four brands that are popular in the market: BMW, Volkswagen, Acura, and Tesla. Three machine learning models are compared: linear regression, a decision tree regressor, and a random forest regressor. The models are evaluated with K-fold cross-validation on criteria such as root-mean-square error and R-squared. All test criteria consistently show that the random forest regressor performs best, achieving an R-squared value of 0.8562. With a suitable model identified, more car trading platforms can offer accurate price predictions that help buyers and sellers understand the real price of used cars.
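
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the author's code: it assumes scikit-learn, uses synthetic data in place of the study's dataset, and the price formula, the 5-fold split, the label encoding of the car name, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption: NOT the dataset used in the study).
rng = np.random.default_rng(0)
n = 400
names = rng.choice(["BMW", "Volkswagen", "Acura", "Tesla"], size=n)
years = rng.integers(1, 11, size=n)                # years of use, 1 to 10
mileage = years * rng.uniform(8_000, 15_000, n)    # total mileage

# Hypothetical price rule, invented only so the models have a signal to fit.
base = {"BMW": 45_000, "Volkswagen": 28_000, "Acura": 35_000, "Tesla": 55_000}
price = (
    np.array([base[b] for b in names]) * 0.9 ** years
    - 0.05 * mileage
    + rng.normal(0, 1_500, n)
)

# Three features: encoded car name, years of use, mileage.
X = np.column_stack([LabelEncoder().fit_transform(names), years, mileage])

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# K-fold cross-validation; k = 5 is an assumption for this sketch.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for label, model in models.items():
    scores = cross_validate(
        model, X, price, cv=cv,
        scoring=("neg_root_mean_squared_error", "r2"),
    )
    results[label] = {
        # scikit-learn reports RMSE negated, so flip the sign back.
        "rmse": -scores["test_neg_root_mean_squared_error"].mean(),
        "r2": scores["test_r2"].mean(),
    }
    print(f"{label}: RMSE={results[label]['rmse']:.1f}, R2={results[label]['r2']:.4f}")
```

Scores from this synthetic data say nothing about the reported 0.8562; the sketch only shows how the three models can be ranked on the same folds with the same two criteria.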

Published

01-09-2024

How to Cite

Li, C. (2024). Machine Learning-Based Models for Accurate Car Prices Prediction. Highlights in Business, Economics and Management, 40, 416-421. https://doi.org/10.54097/9zcpv779