A Review of Research on AI-Assisted Code Generation and AI-Driven Code Review

Yuzhi Wang

doi:10.54097/d6775287

Keywords:

AI, LLM, Code Generation, Code Review

Abstract

With breakthroughs in deep learning technologies such as large language models (LLMs) for code analysis, AI has evolved from an auxiliary tool into a key technology that participates directly in code optimization and in resolving performance issues. As modern software architectures grow more complex, their performance requirements have become more stringent, and developers find it difficult to identify and resolve potential performance issues at the coding stage using traditional methods. This review focuses on two key application areas: AI-assisted code generation and AI-powered code review. It systematically analyzes the application of LLMs in software development and finds that efficiency gains coexist with quality challenges. In code generation, models and tools such as Code Llama and Copilot have significantly accelerated development. In code review, AI can effectively handle coding standards and low-severity defects. However, open problems remain: the reliability and security of LLM-generated code, the insufficient explainability of automated performance analysis results, and the limited interpretability and domain knowledge of LLMs. Future research should prioritize improving the reliability of AI recommendations and advancing AI from an auxiliary tool to an intelligent agent with self-repair capabilities, in order to achieve efficient and secure human-machine collaboration. By systematically reviewing the relevant progress, this article aims to promote the shift of software engineering from a manual, human-driven model to an AI-enhanced automated paradigm, and it provides theoretical references for ensuring backend code quality, improving product delivery speed, and enhancing system reliability.
