Financial Statement Fraud Detection based on Integrated Feature Selection and Imbalance Learning
DOI:
https://doi.org/10.54097/fbem.v8i3.7557
Keywords:
Financial statement fraud, Fraud detection, Feature selection.
Abstract
Drawing on data mining techniques, this paper proposes an integrated feature selection method to construct a feature system for detecting financial statement fraud in listed companies, uses the SMOTE (synthetic minority over-sampling technique) algorithm to address the imbalanced class distribution, and combines both with machine learning models to build a financial statement fraud detection model. An empirical analysis on real financial statement data from Chinese listed companies is conducted to provide support for auditors. The proposed integrated feature selection framework mitigates the poor generalization of single feature selection methods, and SMOTE improves the model's ability to detect financial statement fraud in listed companies.
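The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours). This is not the paper's implementation: the function name smote, its parameters, the random choice of base sample, and the toy two-ratio data are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Assumption: the base sample is drawn at random for each new point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two financial ratios per fraudulent firm.
fraud = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.3]]
non_fraud_count = 10  # assumed size of the majority (non-fraud) class

# Generate enough synthetic fraud samples to match the majority class size.
new_fraud = smote(fraud, n_synthetic=non_fraud_count - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
```

Because every synthetic point is a convex combination of two real minority samples, it always lies on the line segment between them. In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the evaluation data.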
References
Albrecht W, Romney M. Red-flagging management fraud: A validation. Advances in Accounting, 1986, 3: 323-333.
Kinney W R, McDaniel L S. Characteristics of firms correcting previously reported quarterly earnings. Journal of Accounting and Economics, 1989, 11(1): 71-93.
Gozman D, Currie W. The role of Investment Management Systems in regulatory compliance: a Post-Financial Crisis study of displacement mechanisms. Journal of Information Technology, 2014, 29(1): 44-58.
Persons O S. Using financial statement data to identify factors associated with fraudulent financial reporting. Journal of Applied Business Research, 1995, 11(3): 38.
Beasley M. An Empirical Analysis of the Relation between the Board of Director Composition and Financial Statement Fraud. The Accounting Review, 1996, 71(4): 443-465.
Hu L, Gao W, Zhao K, et al. Feature selection considering two types of feature relevancy and feature interdependency. Expert Systems with Applications, 2018, 93: 423-434.
Ravisankar P, Ravi V, Raghava Rao G, et al. Detection of financial statement fraud and feature selection using data mining techniques. Decision Support Systems, 2011, 50(2): 491-500.
Cheng C-H, Kao Y-F, Lin H-P. A financial statement fraud model based on synthesized attribute selection and a dataset with missing values and imbalanced classes. Applied Soft Computing, 2021, 108(3): 107487.
Breiman L. Random Forests. Machine Learning, 2001, 45: 5-32.
Friedman J. Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 2001, 29(5): 1189-1232.
Chawla N V, Bowyer K W, Hall L O, et al. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 2002, 16(1): 321-357.








