Analysis of the Different Statistical Metrics in Machine Learning

Authors

  • Shukun Geng

DOI:

https://doi.org/10.54097/jhq3tv19

Keywords:

Machine learning metrics; model evaluation; classification; regression; clustering.

Abstract

The evaluation of machine learning models plays a pivotal role in ensuring their effectiveness across various domains. Metrics are the main tools for this purpose, quantifying model performance in tasks such as classification, regression, and clustering. This study examines the fundamental metrics used in machine learning, including accuracy, precision, F1-score, RMSE, and the Silhouette Score, and presents their formulas and applications. The analysis underscores the importance of selecting metrics tailored to the specific task while acknowledging the biases and interpretability challenges that may arise. Although metrics provide valuable insights, they also have limitations, particularly where trade-offs between metrics are inevitable. Looking to the future, this study envisions a landscape in which multi-metric assessments, improved interpretability, domain-specific metrics, and explainable AI converge to address current limitations. These advancements promise more robust and transparent model evaluations that adapt to dynamic real-world applications. In summary, this exploration of metrics in machine learning highlights their role in benchmarking model performance, fostering the development of reliable AI systems, and shaping applications in diverse fields. Metrics not only aid informed decision-making but also contribute to advancements in science, industry, and society.
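
As a rough illustration of several metrics named in the abstract, the following minimal Python sketch computes accuracy, precision, recall, F1-score, and RMSE from their standard textbook definitions. The function names, the choice of `1` as the positive label, and the sample data are assumptions made for this example and are not taken from the paper; the Silhouette Score is omitted for brevity.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary task; `positive` is an assumed label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred, positive=1):
    """TP / (TP + FN); returns 0.0 when there are no positive examples."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall."""
    p = precision(y_true, y_pred, positive)
    r = recall(y_true, y_pred, positive)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def rmse(y_true, y_pred):
    """Square root of the mean squared difference between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sample data, chosen only to exercise the functions.
labels_true = [1, 0, 1, 1, 0, 0, 1, 0]
labels_pred = [1, 0, 0, 1, 0, 1, 1, 0]
targets = [3.0, 5.0, 2.0, 7.0]
estimates = [2.0, 5.0, 4.0, 8.0]

print("accuracy :", accuracy(labels_true, labels_pred))
print("precision:", precision(labels_true, labels_pred))
print("F1-score :", f1_score(labels_true, labels_pred))
print("RMSE     :", rmse(targets, estimates))
```

In this sample the classifier makes one false positive and one false negative out of eight predictions, so accuracy, precision, and F1-score happen to coincide; on imbalanced data they generally diverge, which is one reason the choice of metric depends on the task.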

Published

29-03-2024

How to Cite

Geng, S. (2024). Analysis of the Different Statistical Metrics in Machine Learning. Highlights in Science, Engineering and Technology, 88, 350-356. https://doi.org/10.54097/jhq3tv19