Research on an Integrated Intelligent Classification Algorithm Based on K-Means PCA-RF Machine Learning

Authors

  • Bingkun Song
  • Wenxuan Fan
  • Shuo Zhang

DOI:

https://doi.org/10.54097/hset.v49i.8398

Keywords:

Intelligent classification, K-Means, machine learning, random forest.

Abstract

With the rapid development of machine learning and artificial intelligence, research on classification models has grown steadily. This paper proposes a general classification model for classifying indicator features. First, the data are preprocessed with K-Means clustering. Next, their dimensionality is reduced with principal component analysis (PCA). Finally, a random forest (RF) algorithm performs feature classification, with 325 groups of data used for training. The results show that (1) the K-Means PCA-RF algorithm constructed in this paper has good robustness and classification performance, and (2) K-Means PCA-RF can effectively classify features and support sensitivity analysis.
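
The three stages named in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the abstract does not say how K-Means is used for preprocessing, so appending each sample's distances to the cluster centers is an assumption made here, as are the synthetic data (325 samples, chosen only to mirror the sample count in the abstract), the number of clusters, the number of principal components, and the forest size.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 325 samples, 12 features, 2 classes.
X, y = make_classification(
    n_samples=325, n_features=12, n_informative=6, n_classes=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize so K-Means distances and PCA variances are not scale-dominated.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Stage 1 (assumed form of "K-Means preprocessing"): fit clusters on the
# training data and append each sample's distance to every cluster center.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train_s)
X_train_k = np.hstack([X_train_s, kmeans.transform(X_train_s)])
X_test_k = np.hstack([X_test_s, kmeans.transform(X_test_s)])

# Stage 2: PCA dimensionality reduction (5 components is an arbitrary choice).
pca = PCA(n_components=5, random_state=0).fit(X_train_k)
X_train_p = pca.transform(X_train_k)
X_test_p = pca.transform(X_test_k)

# Stage 3: random forest classification on the reduced features.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train_p, y_train)

accuracy = rf.score(X_test_p, y_test)
importances = rf.feature_importances_  # one value per principal component

print(f"held-out accuracy: {accuracy:.3f}")
print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))
print("component importances:", np.round(importances, 3))
```

Every transformer is fitted on the training split only, so the held-out accuracy is not inflated by information from the test samples. The impurity-based importances printed at the end are only a rough proxy for the sensitivity analysis mentioned in the abstract, and they refer to principal components rather than to the original indicators.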

Published

21-05-2023

How to Cite

Song, B., Fan, W., & Zhang, S. (2023). Research on an Integrated Intelligent Classification Algorithm Based on K-Means PCA-RF Machine Learning. Highlights in Science, Engineering and Technology, 49, 20-29. https://doi.org/10.54097/hset.v49i.8398