Machine Learning-Based Models for Accurate Car Prices Prediction
DOI:
https://doi.org/10.54097/9zcpv779
Keywords:
Car Price Prediction, Machine Learning, Random Forest, Linear Regression, Decision Tree.
Abstract
The used car market is growing in many countries, but some car trading platforms predict prices inaccurately. Buyers and sellers therefore need machine learning models that predict used car prices with high accuracy. This study selects the three factors with the greatest impact on used car prices: car name, years of use, and mileage. The dataset comprises more than 2000 records from four popular brands: BMW, Volkswagen, Acura, and Tesla. Three machine learning models are compared: linear regression, decision tree regressor, and random forest regressor. The models are evaluated with K-fold cross-validation, using criteria such as root-mean-square error (RMSE) and R-squared, to compare the strengths and weaknesses of each. All evaluation criteria consistently show that the random forest regressor performs best, achieving an R-squared value of 0.8562. With a suitable model identified, more car trading platforms can offer accurate price predictions, helping buyers and sellers understand the real price of used cars.
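The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic (generated here with made-up brand base prices and depreciation), and the fold count, tree count, and the helper name `compare_models` are all assumptions. The car name is encoded with scikit-learn's `LabelEncoder`, which appears in the reference list, although the paper's exact preprocessing is not given here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the real study uses >2000 listings.
rng = np.random.default_rng(0)
n = 400
names = rng.choice(["BMW", "Volkswagen", "Acura", "Tesla"], size=n)
years = rng.integers(1, 15, size=n)
miles = np.clip(years * 10_000 + rng.normal(0, 5_000, size=n), 0, None)

# Hypothetical base prices and depreciation, only to give the data structure.
base = {"BMW": 45_000, "Volkswagen": 28_000, "Acura": 35_000, "Tesla": 55_000}
base_price = np.array([base[b] for b in names])
price = base_price * 0.9 ** years - 0.05 * miles + rng.normal(0, 1_500, size=n)

# Features: encoded car name, years of use, mileage.
X = np.column_stack([LabelEncoder().fit_transform(names), years, miles])
y = price


def compare_models(X, y, n_splits=5, seed=0):
    """Return {model name: (mean RMSE, mean R-squared)} under K-fold CV."""
    models = {
        "linear regression": LinearRegression(),
        "decision tree": DecisionTreeRegressor(random_state=seed),
        "random forest": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scoring = {"rmse": "neg_root_mean_squared_error", "r2": "r2"}
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=folds, scoring=scoring)
        # scikit-learn reports RMSE negated so that higher is better; flip it back.
        results[name] = (-scores["test_rmse"].mean(), scores["test_r2"].mean())
    return results


results = compare_models(X, y)
for name, (rmse, r2) in results.items():
    print(f"{name:18s} RMSE={rmse:10.2f}  R2={r2:.4f}")
```

The same folds are reused for every model so that the scores are comparable. The numbers printed on synthetic data say nothing about the reported R-squared of 0.8562, which depends on the study's actual dataset.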
References
Singapore Used Car Market Outlook Report 2022-2025: Increasing Used Cars Demand Due to the Pandemic Contributes to Increase in Used Cars Sales During the Economic Crisis - ResearchAndMarkets.com. Businesswire, 2022, June 6. Available at: https://www.businesswire.com/news/home/20220606005609/en/.
Qiu Y, Yang Y, Lin Z, Chen P, Luo Y, Huang W. Improved denoising autoencoder for maritime image denoising and semantic segmentation of USV. China Communications. 2020 Mar;17 (3): 46-57.
Wu Y, Jin Z, Shi C, Liang P, Zhan T. Research on the Application of Deep Learning-based BERT Model in Sentiment Analysis. arXiv preprint arXiv:2403.08217. 2024 Mar 13.
Wang H, Zhou Y, Perez E, Roemer F. Jointly Learning Selection Matrices for Transmitters, Receivers and Fourier Coefficients in Multichannel Imaging. arXiv preprint arXiv:2402.19023. 2024 Feb 29.
Li M, He J, Jiang G, Wang H. DDN-SLAM: Real-time Dense Dynamic Neural Implicit SLAM with Joint Semantic Encoding. arXiv preprint arXiv:2401.01545. 2024 Jan 3.
Gajera P, Gondaliya A, Kavathiya J. Old car price prediction with machine learning. Int. Res. J. Mod. Eng. Technol. Sci, 2021, 3: 284-290.
Miah MBA, Hossain MZ, Hossain MA, Islam MM. Price prediction of stock market using hybrid model of artificial intelligence. International Journal of Computer Applications, 2015, 111 (3).
Gegic E, Isakovic B, Keco D, Masetic Z, Kevric J. Car price prediction using machine learning techniques. TEM Journal, 2019, 8 (1): 113.
scikit-learn developers. sklearn.preprocessing.LabelEncoder — scikit-learn 0.22.1 documentation. 2019. Available at: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html.
Costa-Luis C da. tqdm.notebook - tqdm documentation. Available at: https://tqdm.github.io/docs/notebook/.
Kaggle Carvana - Predict Car Prices. Available at: https://www.kaggle.com/datasets/ravishah1/carvana-predict-car-prices.
Scikit-learn. scikit-learn: machine learning in Python. 2019. Available at: https://scikit-learn.org/stable/.
Benesty J, Chen J, Huang Y, Cohen I. Noise reduction in speech processing. Springer Science & Business Media, 2009, Vol. 2.
Anguita D, Ghelardoni L, Ghio A, Oneto L, Ridella S. The 'K' in K-fold Cross Validation. ESANN, 2012, Vol. 102, pp. 441-446.
Montgomery DC, Peck EA, Vining GG. Introduction to linear regression analysis. John Wiley & Sons, 2021.
Quinlan JR. Induction of decision trees. Machine learning, 1986, 1: 81-106.
Liaw A, Wiener M. Classification and regression by randomForest. R news, 2002, 2 (3): 18-22.
Shanmugasundar G, Vanitha M, Čep R, Kumar V, Kalita K, Ramachandran M. A comparative study of linear, random forest and adaboost regressions for modeling non-traditional machining. Processes, 2021, 9(11): 2015.
Oshiro TM, Perez PS, Baranauskas JA. How many trees in a random forest? In Machine Learning and Data Mining in Pattern Recognition: 8th International Conference, MLDM 2012, Berlin, Germany, July 13-20, 2012. Proceedings 8, Springer Berlin Heidelberg, 2012, pp. 154-168.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.






