Titanic Survival Prediction Based on Machine Learning Algorithms

Authors

  • Yuming Zhang

DOI:

https://doi.org/10.54097/mwkr1a24

Keywords:

Data preprocessing, logistic regression, random forest model.

Abstract

This report demonstrates the application of machine learning techniques to predicting the survival of passengers aboard the Titanic. The Titanic dataset, which includes the variables Pclass, Sex, Age, SibSp, Parch, Fare, Ticket, and Cabin, is analyzed, and two machine learning algorithms, Logistic Regression and Random Forest, are used to predict survival. The models are compared in terms of accuracy, and the magnitude of each factor's effect on survival is also identified. Data preprocessing is the essential technique used to prepare the dataset. Before preprocessing, correlations between variables are analyzed to guide feature engineering. Feature engineering begins with data conversion and the filling of missing values. Afterwards, features are selected and new features are derived for model implementation. In interpreting the final model outputs, new features, such as name length, together give insight into more implicit survival factors that have previously been ignored.
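
The feature engineering steps named in the abstract (data conversion, filling of missing values, and deriving a name-length feature) can be illustrated with a minimal sketch. This is not the author's implementation: the toy rows, the `engineer_features` helper, the 0/1 encoding of Sex, and the choice of the median to fill missing ages are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical toy rows in the style of the Titanic dataset (not real records).
passengers = [
    {"Name": "Smith, Mr. John", "Sex": "male", "Age": 30.0},
    {"Name": "Doe, Mrs. Jane Elizabeth", "Sex": "female", "Age": None},
    {"Name": "Roe, Miss. Ann", "Sex": "female", "Age": 20.0},
    {"Name": "Poe, Master. Tom", "Sex": "male", "Age": 4.0},
]


def engineer_features(rows):
    """Return numeric feature rows built from raw passenger records."""
    # Vacancy filling: use the median of the known ages (an assumed strategy).
    known_ages = [r["Age"] for r in rows if r["Age"] is not None]
    age_fill = median(known_ages)

    features = []
    for r in rows:
        features.append({
            # Data conversion: encode the categorical Sex column as 0/1.
            "Sex": 1 if r["Sex"] == "female" else 0,
            # Keep the recorded age, or fall back to the median fill value.
            "Age": r["Age"] if r["Age"] is not None else age_fill,
            # Derived feature: number of characters in the passenger name.
            "NameLength": len(r["Name"]),
        })
    return features


features = engineer_features(passengers)
for row in features:
    print(row)
```

The resulting numeric rows are the kind of input that a logistic regression or random forest classifier could then be trained on; the model fitting and accuracy comparison are outside the scope of this sketch.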

Published

15-08-2024

How to Cite

Zhang, Y. (2024). Titanic Survival Prediction Based on Machine Learning Algorithms. Highlights in Science, Engineering and Technology, 107, 189-195. https://doi.org/10.54097/mwkr1a24