Research on Personalized Recommendations for Optimal Detection Timing Based on K-Means Clustering and Logistic Regression

Authors

  • Jing Ding
  • Huilin Sun
  • Ziquan Feng

DOI:

https://doi.org/10.54097/40vtz891

Keywords:

Multiple Linear Regression Model, K-Means Clustering, Logistic Regression.

Abstract

This paper addresses the selection of optimal timing in specific detection scenarios. It integrates multiple linear regression, K-Means clustering, and logistic regression into a quantitative decision-making system that improves detection efficiency and reliability. The study first uses a Pearson correlation coefficient matrix to analyze the linear correlation between target indicator concentrations and multidimensional features, identifying the core influencing variables. A multiple linear regression model is then fitted and tested for significance with the statsmodels toolkit, quantifying the strength of each feature's influence on the target indicator. During data preprocessing, outliers are identified with the IQR and Z-score methods and trimmed so that valid information is retained. K-Means clustering combined with the elbow method determines the optimal number of clusters, dividing the key features into four significantly distinct groups. A separate logistic regression model is established for each group to fit how the detection success rate varies over time. The optimal detection timepoint for each group is determined using a 90% success rate threshold, together with a risk level assessment. The resulting technical framework provides data-driven support for personalized detection timing selection and establishes a quantitative analytical paradigm for optimizing similar detection scenarios.
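
As a rough illustration of two steps named in the abstract, the sketch below flags outliers with IQR fences and Z-scores, then inverts a fitted logistic curve to find the earliest time at which the predicted success rate reaches the 90% threshold. It is a minimal sketch, not the authors' implementation. The function names, the 1.5 IQR multiplier, the single-predictor form p(t) = 1 / (1 + exp(-(b0 + b1 t))), and the per-group coefficients are all assumptions made for illustration; none of the numbers come from the paper.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences; points outside them are flagged as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def z_scores(values):
    """Standard scores based on the sample mean and sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def success_probability(t, intercept, slope):
    """Assumed logistic model: p(t) = 1 / (1 + exp(-(intercept + slope * t)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * t)))


def earliest_time(intercept, slope, threshold=0.9):
    """Smallest t with p(t) >= threshold, valid only when slope > 0.

    Solving p(t) = threshold gives t = (logit(threshold) - intercept) / slope.
    """
    if slope <= 0:
        raise ValueError("success rate must increase with time")
    logit = math.log(threshold / (1.0 - threshold))
    return (logit - intercept) / slope


# Hypothetical (intercept, slope) pairs per cluster, NOT values from the paper.
groups = {"group_1": (-6.0, 0.5), "group_2": (-9.0, 0.6)}
optimal = {name: earliest_time(b0, b1) for name, (b0, b1) in groups.items()}
```

Under these assumed coefficients, the group with the lower intercept reaches the threshold later, which is the kind of per-group difference that the clustering step is meant to expose.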

Published

31-12-2025

Section

Articles

How to Cite

Ding, J., Sun, H., & Feng, Z. (2025). Research on Personalized Recommendations for Optimal Detection Timing Based on K-Means Clustering and Logistic Regression. Mathematical Modeling and Algorithm Application, 7(3), 78-83. https://doi.org/10.54097/40vtz891