Function-Controllable Protein Generation: Methods and Evaluation

Authors

  • Junyuan Zhou

DOI:

https://doi.org/10.54097/3qvn7n30

Keywords:

Protein design, Diffusion models, Ligand conditioning, Domain insertion, Functional controllability.

Abstract

Deep learning is moving protein design from merely foldable proteins toward operationally functional ones, advancing generative design with programmable functionality in synthetic biology and drug discovery. This article reviews three methods. FoldingDiff performs diffusion modelling in the space of folding angles, which allows complex topologies to be generated and alleviates the burden of equivariance. LigandMPNN models the 3D environment of the ligand accurately and thereby achieves high sequence recovery rates and strong interface metrics on specifically selected test sets. ProDomino predicts domain insertion sites and has been verified on light- and small-molecule-controlled switches in many systems. Each method has limitations. The article explores remedies such as using ProteinMPNN for second-stage screening, strengthening physical validation, and expanding coverage of ligand complexes, and it points to two additional methods that support these goals. PinMyMetal enables high-precision modelling of transition metal pockets. DIP-seq can locate insertion hotspots that tolerate insertion well. With these improvements, the boundary between foldable proteins and programmable functional proteins is narrowing, which offers reproducible and comparable engineering strategies for drug development, synthetic biology and materials design.
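Two quantities named in the abstract can be made concrete with a minimal sketch. It assumes that sequence recovery is the fraction of aligned positions where the designed residue matches the native one, and that noise in angle space is applied by adding Gaussian noise and wrapping the result back onto the circle. The function names `sequence_recovery`, `wrap_angle` and `noise_angles` are hypothetical, and the noising step is a generic illustration, not the actual schedule or evaluation protocol of FoldingDiff or LigandMPNN.

```python
import math
import random


def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumes both sequences are already aligned and have the same length.
    """
    if len(designed) != len(native):
        raise ValueError("sequences must be aligned to the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def wrap_angle(theta: float) -> float:
    """Map an angle in radians onto the interval [-pi, pi)."""
    return (theta + math.pi) % (2 * math.pi) - math.pi


def noise_angles(angles, sigma, rng):
    """Add Gaussian noise to each angle, then wrap it back onto the circle.

    This is a generic single noising step (an assumption for illustration),
    not the noise schedule of any specific published model.
    """
    return [wrap_angle(a + rng.gauss(0.0, sigma)) for a in angles]


if __name__ == "__main__":
    # Three of four residues match between design and native.
    print(sequence_recovery("ACDE", "ACDF"))

    # Backbone-style angles in radians, perturbed with a fixed seed.
    rng = random.Random(0)
    print(noise_angles([0.0, 1.5, -3.0], sigma=0.5, rng=rng))
```

Because internal angles do not change when the whole structure is rotated or translated, a model that operates on them needs no explicit equivariant architecture, which is the sense in which working in angle space eases the equivariance burden.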

References

[1] Wu K E, Yang K, Van den Berg R, Alamdari S, Zou J Y, Lu A X, Amini A P. Protein structure generation via folding diffusion. Nature Communications, 2024, 15: 1059.

[2] Dauparas J, Lee G R, Pecoraro R, et al. Atomic context-conditioned protein sequence design using LigandMPNN. Nature Methods, 2025, 22: 717 – 723.

[3] Wolf B, Shehu P, Brenker L, Von Bachmann A, Kroell A S, Southern N, Holderbach S, Eigenmann J, Aschenbrenner S, Mathony J, Niopek D. Rational engineering of allosteric protein switches by in silico prediction of domain insertion sites. Nature Methods, 2025, 15: 412.

[4] Buttenschoen M, Morris G M, Deane C M. PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences. Chemical Science, 2024, 15: 3130 – 3139.

[5] Bryant P, Kelkar A, Guljas A, Clementi C, Noé F. Structure prediction of protein-ligand complexes from sequence information with Umol. Nature Communications, 2024, 15: 4536.

[6] Jänes J, Beltrao P. Deep learning for protein structure prediction and design—progress and applications. Molecular Systems Biology, 2024, 20 (3): 162 – 169.

[7] Khakzad H, Igashov I, Schneuing A, Goverde C, Bronstein M, Correia B. A new age in protein design empowered by deep learning. Cell Systems, 2023, 14 (11): 925 – 946.

[8] Bennett N R, Coventry B, Goreshnik I, Huang B, Allen A, Vafeados D, Peng Y P, Dauparas J, Baek M, Stewart L, DiMaio F, De Munck S, Savvides S N, Baker D. Improving de novo protein binder design with deep learning. Nature Communications, 2023, 14: 2625.

[9] Zhang H, Zhong J, Gucwa M, Zhang Y, Ma H, Deng L, Mao L, Minor W, Wang N, Zheng H. PinMyMetal: a hybrid learning system to accurately model transition metal binding sites in macromolecules. Nature Communications, 2025, 16: 3043.

[10] Coyote-Maestas W, He Y, Myers C L, Schmidt D. Domain insertion permissibility-guided engineering of allostery in ion channels. Nature Communications, 2019, 10: 290.

Published

15-03-2026

Section

Articles

How to Cite

Zhou, J. (2026). Function-Controllable Protein Generation: Methods and Evaluation. Mathematical Modeling and Algorithm Application, 9(1), 479-482. https://doi.org/10.54097/3qvn7n30