Data set analysis of Titanic distress data

Authors

  • Linghan Gao

DOI:

https://doi.org/10.54097/whp21y56

Keywords:

Titanic dataset, K-Nearest Neighbors, Cox proportional hazards model, cumulative hazard function.

Abstract

This paper studies the sinking of the Titanic, using the open-source Titanic dataset on Kaggle as its data source. It applies random forest and Cox proportional hazards models, together with survival and cumulative hazard functions, each calibrated accordingly, to analyze in detail the factors that affected passenger survival. The factors considered include cabin class, port of embarkation, and the family members and friends travelling with a passenger, all of which affect a passenger's chance of survival. Based on code run against the open-source dataset, the paper concludes that passenger survival is strongly related to fare and cabin class.
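
To make the link between the cumulative hazard function and the survival function concrete, the following is a minimal sketch, not the paper's own code. It assumes the Nelson-Aalen estimator for the cumulative hazard H(t) and the relation S(t) = exp(-H(t)); the abstract does not say which estimator was used. The function name nelson_aalen and the toy durations and event flags are hypothetical and are not taken from the Titanic dataset.

```python
import math
from collections import Counter


def nelson_aalen(durations, events):
    """Return [(t, H(t), S(t))] at each distinct event time.

    durations: observed time for each subject (hypothetical toy values)
    events:    1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # each subject leaves the risk set at its own time
    at_risk = len(durations)
    cum_hazard = 0.0
    rows = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Nelson-Aalen increment: events at t divided by subjects at risk
            cum_hazard += d / at_risk
            rows.append((t, cum_hazard, math.exp(-cum_hazard)))
        at_risk -= exits[t]
    return rows


# Toy data for illustration only (assumption, not from the paper).
toy_durations = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
table = nelson_aalen(toy_durations, toy_events)
for t, H, S in table:
    print(f"t={t}  H(t)={H:.3f}  S(t)={S:.3f}")
```

Each row gives the cumulative hazard accumulated up to an event time and the implied survival probability. A Cox proportional hazards model would then scale the baseline hazard by a factor that depends on covariates such as fare or cabin class.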

Published

10-04-2024

How to Cite

Gao, L. (2024). Data set analysis of Titanic distress data. Highlights in Science, Engineering and Technology, 92, 323-329. https://doi.org/10.54097/whp21y56