Application and Challenges of Statistical Methods in Biological Genetics

Authors

  • Jingyi Sun

DOI:

https://doi.org/10.54097/hset.v40i.6519

Keywords:

Genome testing, Lasso Regression, Chen-Stein Method, Logistic Regression.

Abstract

Humans have long been curious about genes, in plants and animals alike, and in contexts ranging from breeding to disease. For centuries, some conditions have been regarded as hereditary, and with the development of medicine, people have realized that many diseases are heritable. The birth of modern statistics has given researchers many models for studying such questions. This article focuses on the application of statistical methods in biological genetics. It introduces the principles and applications of Least Absolute Shrinkage and Selection Operator (Lasso) regression, the Chen-Stein method, and the logistic regression model in different branches of the field, such as gene set selection. When used correctly, these models can mitigate the problem of reproducibility in genetics to a certain extent. In addition, they offer an effective means of data analysis in the field of genetics. Although the three models have weaknesses, such as the use and selection of prior information, it is reasonable to believe that, as mathematicians continue to improve them, their prospects will improve.

Published

29-03-2023

How to Cite

Sun, J. (2023). Application and Challenges of Statistical Methods in Biological Genetics. Highlights in Science, Engineering and Technology, 40, 43-49. https://doi.org/10.54097/hset.v40i.6519