Lung Cancer Prediction based on KNN, Logistic Regression, and Random Forest Algorithm

Authors

  • Yunzhe Liao

DOI:

https://doi.org/10.54097/vh9rws94

Keywords:

Lung cancer, Prediction model, Evaluation, Machine learning.

Abstract

Cancer has become the leading threat to human health in the 21st century, and lung cancer, often called the "number one killer" among cancers, has the highest incidence rate of any cancer. Because many factors contribute to lung cancer, predicting its incidence from those factors is important for controlling the disease and reducing its incidence rate. Using a dataset from Kaggle, we visualize and preprocess data on 25 factors that may contribute to lung cancer incidence, then train three machine learning models, K-nearest neighbors (KNN), random forest, and logistic regression, and select the one with the highest accuracy. The random forest algorithm performed best among the three models, so it was chosen for model tuning, with its parameters adjusted to improve accuracy. Tuning increased the model accuracy from 0.96 to 0.98, achieving the expected effect. This study supports a clearer and more accurate understanding of how different factors affect the incidence of lung cancer, and more efficient prediction of lung cancer in the medical field.
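The workflow described in the abstract (compare three classifiers by accuracy, then tune the best one) can be sketched with scikit-learn as follows. This is a minimal illustration and not the author's code: the Kaggle dataset is not identified here, so the sketch uses a synthetic 25-feature dataset from `make_classification`, and the sample size, train/test split, feature scaling, and parameter grid are all assumptions chosen for illustration. The accuracies it produces have no relation to the 0.96 and 0.98 reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for the Kaggle data, with 25 input factors.
X, y = make_classification(
    n_samples=600, n_features=25, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The three model families named in the abstract. Scaling is added for the
# distance-based and linear models (an assumption, not stated in the source).
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out test set.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)

# Tune the random forest with a small, hypothetical parameter grid,
# using 3-fold cross-validation on the training set only.
param_grid = {"n_estimators": [100, 200], "max_depth": [None, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)
tuned_accuracy = search.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
print(f"tuned random forest: {tuned_accuracy:.3f} with {search.best_params_}")
```

Keeping the test set out of the grid search means the tuned accuracy is measured on data that played no part in choosing the parameters, which makes the before-and-after comparison fairer.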

Published

10-04-2024

How to Cite

Liao, Y. (2024). Lung Cancer Prediction based on KNN, Logistic Regression, and Random Forest Algorithm. Highlights in Science, Engineering and Technology, 92, 280-287. https://doi.org/10.54097/vh9rws94