Investigation of robustness and numerical stability of multiple regression and PCA in modeling world development data
DOI:
https://doi.org/10.54097/hset.v16i.2503
Keywords:
Multiple regression; PCA; world development data.
Abstract
Multiple regression and PCA are popular methods for modeling labelled and unlabelled data, respectively, and have been applied in research to a vast number of datasets. In this investigation, we attempt to push the limits of these two methods by fitting them to world development data, a dataset notorious for its complexity and high dimensionality. We assess the robustness and numerical stability of both methods using the matrix condition number and their ability to capture variance in the dataset. The results indicate poor performance from both methods from a numerical standpoint, yet certain qualitative insights can still be captured.
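The two diagnostics named in the abstract, the matrix condition number and the share of variance captured, can be illustrated with a minimal sketch. It is not the authors' code and does not use the World Bank data: the matrices below are synthetic, and the function names `condition_number` and `explained_variance_ratio` are hypothetical helpers introduced only for illustration. The sketch assumes the 2-norm condition number (ratio of largest to smallest singular value) and computes the PCA variance shares from the singular values of the centered data.

```python
import numpy as np


def condition_number(X):
    """2-norm condition number: largest singular value over the smallest."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
    return s[0] / s[-1]


def explained_variance_ratio(X):
    """Share of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                 # center each column
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2                            # proportional to component variances
    return var / var.sum()


# Synthetic stand-in for a design matrix (assumption: NOT the paper's dataset).
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 3))

# A fourth column that is almost an exact copy of the first one,
# mimicking strongly collinear indicators.
collinear = base[:, 0] + 1e-6 * rng.normal(size=n)

X_good = base
X_bad = np.column_stack([base, collinear])

kappa_good = condition_number(X_good)
kappa_bad = condition_number(X_bad)
ratios = explained_variance_ratio(X_bad)

print("condition number, independent columns:", kappa_good)
print("condition number, near-collinear columns:", kappa_bad)
print("explained variance ratio per component:", ratios)
```

A large condition number signals that small perturbations in the data can produce large changes in fitted regression coefficients, which is the sense of numerical instability the abstract refers to. In the near-collinear case the last principal component carries almost no variance.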
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







