Prediction of Protein Secondary Structure Using a Hybrid Convolutional Blocks with GRU Units

Authors

  • Shiwei Yang
  • Xiaozhou Chen

DOI:

https://doi.org/10.54097/k8rmxp27

Keywords:

Convolutional Block, GRUs, Secondary Structure

Abstract

Understanding protein function is a long-standing goal of bioinformatics, and protein structure prediction is one of its central research topics. Starting from the primary (one-dimensional) structure, that is, the amino acid sequence, secondary structure can be predicted by treating the task as a sequence labeling problem and solving it with deep neural networks. In this study, we predict protein secondary structure using convolutional blocks combined with gated recurrent units (GRUs). Unlike previous convolutional neural network architectures, our model mixes two convolutional blocks of different scales and combines them with GRUs for sequence labeling. In addition, because the input encodings used in previous studies were too simple, we add the physicochemical properties of the amino acids and the logarithmic relative probabilities of amino acids for the 8 secondary structure types as input features, which improved accuracy. Experiments show that our model performs well in terms of Q8 accuracy.
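For readers unfamiliar with the metric, Q8 accuracy is the fraction of residues whose predicted 8-state secondary structure label matches the reference label. The sketch below is a minimal illustration, not the authors' evaluation code: the function name `q8_accuracy`, the use of `C` for the coil state, and the pooling of all residues into one per-residue score are assumptions made for this example.

```python
# Minimal sketch of per-residue Q8 accuracy (illustrative only).
# Assumption: the eight states follow the DSSP alphabet, with 'C' standing
# for coil/other; some datasets write this state as 'L' or '-' instead.
DSSP8_STATES = frozenset("HGIEBTSC")


def q8_accuracy(true_labels: str, pred_labels: str) -> float:
    """Return the fraction of positions where the predicted state equals the true state."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")
    unknown = (set(true_labels) | set(pred_labels)) - DSSP8_STATES
    if unknown:
        raise ValueError(f"unexpected state symbols: {sorted(unknown)}")
    # Count matching positions and normalise by the number of residues.
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return correct / len(true_labels)


if __name__ == "__main__":
    # Hypothetical 8-residue example: only the last residue is mispredicted.
    print(q8_accuracy("HHHEECCT", "HHHEECCS"))
```

Whether scores are pooled over all residues in a test set or averaged per protein varies between studies; the abstract does not say which convention is used here.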

Published

29-12-2024

Issue

Vol. 10 No. 3 (2024)

Section

Articles

How to Cite

Yang, S., & Chen, X. (2024). Prediction of Protein Secondary Structure Using a Hybrid Convolutional Blocks with GRU Units. Frontiers in Computing and Intelligent Systems, 10(3), 23-30. https://doi.org/10.54097/k8rmxp27