Predicting Pet Adoption Outcomes: A Comparative Study of Machine Learning Models
DOI: https://doi.org/10.54097/pky5zh96

Keywords: Pet Adoption Prediction, Machine Learning Models, Artificial Neural Networks (ANN), Random Forest (RF)

Abstract
This study examines how machine learning models can be used to predict pet adoption outcomes using the Predict Pet Adoption Status Dataset from Kaggle, which contains records dating back to 2007. The aim is to make the pet adoption process more efficient by systematically comparing the performance of several models: Artificial Neural Networks (ANN), Decision Trees (DT), Random Forest (RF), and Logistic Regression (LR). Each model is evaluated on accuracy, Area Under the Curve (AUC), interpretability, and computational efficiency. The results indicate that Decision Trees achieve the highest test-set accuracy (92.53%) with minimal overfitting, making them the most suitable model for this task. Although the ANN achieves the highest training accuracy (99.56%), it suffers from significant overfitting, underscoring the importance of proper regularization and large datasets for such models. The findings provide a data-driven framework that shelters and rescue organizations can use to predict and improve pet adoption outcomes, ultimately contributing to animal welfare.
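The comparison described in the abstract might be sketched as follows. This is a minimal illustration assuming a scikit-learn workflow; it uses a synthetic binary-classification dataset as a stand-in for the Kaggle data, and the hyperparameters shown are illustrative assumptions, not the authors' settings.

```python
# Sketch: compare DT, RF, LR, and an ANN (MLP) on accuracy and AUC.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

# Synthetic stand-in for the Predict Pet Adoption Status Dataset.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "DT": DecisionTreeClassifier(max_depth=5, random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))        # test-set accuracy
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])  # AUC
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

Comparing training against test accuracy for each model, as the study does, is what reveals overfitting such as that reported for the ANN.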
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
