Multivariate Linear Regression Modeling for Influencing Factors of Fetal Y-Chromosome Concentration in Pregnant Women
DOI:
https://doi.org/10.54097/ahg41748
Keywords:
Y-chromosome concentration, multivariate linear regression, noninvasive prenatal testing, generalized additive model, cubic spline, gestational age, body mass index
Abstract
Noninvasive prenatal testing (NIPT) screens for chromosomal abnormalities by analyzing cell-free fetal DNA in maternal peripheral blood, and the accurate quantification of fetal Y-chromosome concentration in male pregnancies is a decisive indicator of test validity. However, individual variability among pregnant women is substantial: gestational age, body mass index (BMI), maternal age, and several other factors jointly modulate Y-chromosome concentration, which makes a single threshold rule inadequate across heterogeneous populations. The present study addresses the joint analysis of factors influencing Y-chromosome concentration by constructing a modeling framework centered on multivariate linear regression. The framework unifies a basic linear structure, generalized additive smoothing terms, and cubic spline basis expansions within a single estimation pipeline, and it uses Pearson correlation, Spearman rank correlation, the F test, and the Akaike information criterion for variable screening and significance testing. Evaluation is carried out on a clinical NIPT dataset of 1082 records covering 12 obstetric indicators. On a held-out test partition, the proposed method attains a coefficient of determination of 0.687, a mean absolute error of 0.018, and a root mean square error of 0.024, which corresponds to an improvement of approximately 11.4% in the coefficient of determination over a purely linear baseline. The estimated regression coefficients indicate that gestational age exerts a significant positive effect on Y-chromosome concentration, that BMI exerts a significant negative modulating effect, and that maternal age, height, weight, and overall GC content contribute smaller yet statistically significant marginal effects when combined in the joint model.
The proposed framework provides a quantitative basis for personalized decisions about the optimal NIPT testing window and is therefore relevant for improving the accuracy and timeliness of prenatal screening.
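As a rough illustration of the comparison the abstract describes (a purely linear baseline versus a cubic spline expansion, scored by the coefficient of determination, MAE, RMSE, and AIC), the following minimal sketch may help. It is not the authors' pipeline: the data are synthetic, the data-generating process, the knot positions, the rescaling of the predictors, and all function names are assumptions made for illustration, and only gestational age and BMI are used as predictors.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(X, beta):
    return [sum(b * v for b, v in zip(beta, row)) for row in X]

def metrics(y, yhat):
    """Coefficient of determination, mean absolute error, root mean square error."""
    n = len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ybar = sum(y) / n
    tss = sum((a - ybar) ** 2 for a in y)
    mae = sum(abs(a - b) for a, b in zip(y, yhat)) / n
    return {"r2": 1 - rss / tss, "mae": mae, "rmse": math.sqrt(rss / n)}

def linear_design(t, z):
    """Intercept plus linear terms in gestational age (t) and BMI (z)."""
    return [[1.0, ti, zi] for ti, zi in zip(t, z)]

def spline_design(t, z, knots):
    """Cubic spline in t (truncated power basis) plus a linear term in z."""
    return [[1.0, ti, zi, ti ** 2, ti ** 3] + [max(ti - k, 0.0) ** 3 for k in knots]
            for ti, zi in zip(t, z)]

def fit_report(X, y, split):
    """Fit on the first `split` rows; score on both partitions."""
    beta = ols(X[:split], y[:split])
    train = metrics(y[:split], predict(X[:split], beta))
    test = metrics(y[split:], predict(X[split:], beta))
    # Gaussian AIC up to an additive constant, computed on the training fit.
    aic = split * math.log(train["rmse"] ** 2) + 2 * len(beta)
    return {"train": train, "test": test, "aic": aic}

# Synthetic data only (assumption): concentration rises and saturates with
# gestational age and declines linearly with BMI, plus Gaussian noise.
random.seed(0)
n, split = 400, 300
ga = [random.uniform(10, 25) for _ in range(n)]    # gestational age, weeks
bmi = [random.uniform(20, 40) for _ in range(n)]   # BMI, kg/m^2
y = [0.03 + 0.09 * (1 - math.exp(-(g - 10) / 5)) - 0.002 * (b - 30)
     + random.gauss(0, 0.01) for g, b in zip(ga, bmi)]

t = [(g - 10) / 15 for g in ga]    # rescale to [0, 1] for numerical stability
z = [(b - 30) / 10 for b in bmi]   # center and scale BMI
results = {
    "linear": fit_report(linear_design(t, z), y, split),
    "spline": fit_report(spline_design(t, z, [1 / 3, 2 / 3]), y, split),
}
for name, res in results.items():
    print(name, {k: round(v, 4) for k, v in res["test"].items()}, "AIC", round(res["aic"], 1))
```

Because the linear design is nested inside the spline design, the spline fit can never have a larger training residual sum of squares; whether it also wins on the held-out partition and on AIC depends on how nonlinear the true relationship is, which is what the comparison is meant to reveal.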
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







