Application of K-nearest neighbors in protein-protein interaction prediction
DOI:
https://doi.org/10.54097/hset.v2i.564
Keywords:
K-Nearest Neighbor, Conjoint Triad, Auto Covariance, Local Descriptor, Protein-protein interaction
Abstract
Protein-protein interactions (PPIs) underlie almost all life processes in organisms, and studying them plays an important role in revealing the mechanisms of life activities. To improve PPI prediction performance, we combine the K-Nearest Neighbor (KNN) classifier with three protein sequence encoding methods, Conjoint Triad (CT), Auto Covariance (AC) and Local Descriptor (LD), to construct three PPI prediction models: KNN-CT, KNN-AC and KNN-LD. The KNN-CT and KNN-AC models achieve accuracies of 94.29% and 94.69%, respectively, which are better than those of existing methods. These results show that K-nearest neighbors can be a useful complement to existing approaches for predicting protein-protein interactions.
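To make the KNN-CT idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the seven-class amino-acid grouping commonly used with Conjoint Triad encoding, min-max scaling of triad counts, concatenation of the two proteins' vectors as the pair feature, and Euclidean distance with majority voting. None of these choices is stated in the abstract, and the function names (`conjoint_triad`, `encode_pair`, `knn_predict`) are hypothetical.

```python
from collections import Counter
from itertools import product
import math

# Assumed seven-class amino-acid grouping (not given in the abstract).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# All 7^3 = 343 possible triads of classes, each mapped to a vector index.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(seq):
    """Encode one protein sequence as a 343-dim vector of scaled triad counts."""
    # Residues outside the 20 standard amino acids are skipped.
    classes = [AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS]
    counts = [0] * len(TRIAD_INDEX)
    for i in range(len(classes) - 2):
        counts[TRIAD_INDEX[tuple(classes[i:i + 3])]] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:  # sequence too short or perfectly uniform counts
        return [0.0] * len(counts)
    # Min-max scaling is an assumption; other normalizations are possible.
    return [(c - lo) / (hi - lo) for c in counts]


def encode_pair(seq_a, seq_b):
    """Represent a protein pair by concatenating both encodings (686 dims)."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k nearest training vectors."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy usage with made-up sequences and labels (1 = interacting, 0 = not).
train_pairs = [("AGVAGVRK", "DEDECAGV"), ("ILFPILFP", "YMTSYMTS"),
               ("HNQWHNQW", "RKRKDEDE"), ("CAGVCAGV", "ILFPYMTS")]
train_labels = [1, 1, 0, 0]
train_X = [encode_pair(a, b) for a, b in train_pairs]
prediction = knn_predict(train_X, train_labels, encode_pair("AGVAGVRK", "DEDECAGV"), k=1)
```

The AC and LD variants would swap only the `conjoint_triad` encoder, leaving the pair construction and the KNN step unchanged; the value of `k` and the distance metric are tuning choices that this sketch does not try to reproduce.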
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







