The Source Code Comment Generation based on Abstract Syntax Tree

Authors

  • Daoyang Ming
  • Weicheng Xiong

DOI:

https://doi.org/10.54097/fcis.v5i3.13837

Keywords:

Deep Learning, Source Code Comment Generation, Abstract Syntax Tree

Abstract

Code summarization provides the main aim described in natural language of the given function; it can benefit many tasks in software engineering. Due to the special grammar and syntax structure of programming languages and various shortcomings of different deep neural networks, the accuracy of existing code summarization approaches is not good enough. We proposes to use abstract syntax trees for source code summarization .Our solution is inspired by recent advances in neural machine translation, as well as an approach called SBT by Hu et al. We evaluate our approach using the automated metric BLEU and compare it to other relevant models.

Downloads

Download data is not yet available.

References

N. KALCHBRENNER, L. ESPEHOLT, K. SIMONYAN, A. V. D. OORD, AND K. KAVUKCUOGLU, Neural machine translation in linear time, arXiv preprint arXiv:1610.10099, (2016).

A. GRAVES, Generating sequences with recurrent neural networks, arXiv preprint arXiv:1308.0850, (2013).

A. LOUIS, S. K. DASH, E. T. BARR, AND C. SUTTON, Deep learning to detect redundant method comments, arXiv preprint arXiv:1806.04616, (2018).

R. COLLOBERT AND J. WESTON, A unified architecture for natural language processing: Deep neural networks with multitask learning, in Machine Learning, Proceedings of the Twenty-Fifth International Conference (ICML 2008), 2008, pp. 160–167.

S. IYER, I. KONSTAS, A. CHEUNG, AND L. ZETTLEMOYER, Summarizing source code using a neural attention model., in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (1), 2016, pp. 2073–2083.

M. MALHOTRA AND J. K. CHHABRA, Class level code summarization based on dependencies and micro patterns, in 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), IEEE, 2018, pp. 1011–1016.

A. SEE, P. J. LIU, AND C. D. MANNING, Get to the point: Summarization with pointer-generator networks, in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, Canada, July 2017, Association for Computational Linguistics, pp. 1073–1083.

H. SHI, H. ZHOU, J. CHEN, AND L. LI, On tree-based neural sentence modeling, arXiv preprint arXiv:1808.09644, (2018).

K. PAPINENI, S. ROUKOS, T. WARD, AND W. J. ZHU, Bleu: a method for automatic evaluation of machine translation, in Proceedings of the 40th annual meeting on association for computational linguistics, Association for Computational Linguistics, 2002, pp. 311–318.K.

M. HAMMAD, A. ABULJADAYEL, AND M. KHALAF, Summarizing services of java packages, Lecture Notes on Software Engineering, 4 (2016), pp. 129–132.

L. MORENO, J. APONTE, G. SRIDHARA, A. MARCUS, L. POLLOCK, AND K. VIJAYSHANKER, Automatic generation of natural language summaries for java classes, in 2013 21st International Conference on Program Comprehension (ICPC), IEEE, 2013, pp. 23–32.

R. COLLOBERT AND J. WESTON, A unified architecture for natural language processing: Deep neural networks with multitask learning, in Machine Learning, Proceedings of the Twenty-Fifth International Conference (ICML 200.

N. KALCHBRENNER, L. ESPEHOLT, K. SIMONYAN, A. V. D. OORD, AND K. KAVUKCUOGLU, Neural machine translation in linear time, arXiv preprint arXiv:1610.10099, (2016).8), 2008, pp. 160–167.

N. J. ABID, N. DRAGAN, M. L. COLLARD, AND J. I. MALETIC, Using stereotypes in the automatic generation of natural language summaries for c++ methods, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, 2015, pp. 561–565.

X. Hu, G. Li, X. Xia, D. Lo, and Z. Jin. Deep code comment generation. In Proceedings of the 26th International Conference on Program Comprehension, pages 200–210. ACM, 2018.

K. Chen, J. Wang, L.-C. Chen, H. Gao, W. Xu, and R. Nevatia. Abc-cnn: An attention based convolutional neural network for visual question answering. arXiv preprint arXiv:1511.05960, 2015.

A. Marcus and J. I. Maletic. Recovering documentation-to-source-code traceability links using latent semantic indexing. In 25th IEEE/ACM International Conference on Software Engineering (ICSE’03), pages 125–137, Portland, OR, 2003.

C. Lopes, S. Bajracharya, J. Ossher, and P. Baldi. UCI source code data sets, 2010. URL http://www.ics. uci.edu/$ sim $ lopes/ datasets/.

M. L. Collard, M. J. Decker, and J. I. Maletic. Lightweight transformation and fact extraction with the srcml toolkit. In Source Code Analysis and Manipulation (SCAM), 2011 11th IEEE International Working Conference on, pages 173–184. IEEE, 2011.

S. H. Tan, D. Marinov, L. Tan, and G. T. Leavens. @tcomment: Testing javadoc comments to detect comment-code inconsistencies. In 2012 IEEE Fifth International Conference on Software Testing, Verification and Validation, pages 260–269, April 2012. doi: 10.1109/ICST.2012.106.

A. Louis, S. K. Dash, E. T. Barr, and C. A. Sutton. Deep learning to detect redundant method comments. CoRR, abs/1806.04616, 2018.

K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL ’02, pages 311–318, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics, Association for Computational Linguistics. doi: 10.3115/1073083.1073135.

Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3156–3164, 2015.

S.-A. Gr¨onroos. Opennmt ensemble decoding, 2019. URL https://github. com/OpenNMT/OpenNMT-py/pull/732.

E. Garmash and C. Monz. Ensemble learning for multi-source neural machine translation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1409–1418, 2016.

Paul W. McBurney and Collin McMillan. 2016. Automatic Source Code Summarization of Context for Java Methods. IEEE Trans. Software Eng. 42, 2 (2016), 103–119. https://doi. org/ 10.1109/TSE.2015.2465386.

Laura Moreno, Jairo Aponte, Giriprasad Sridhara, Andrian Marcus, Lori L. Pollock, and K. Vijay-Shanker. 2013. Automatic generation of natural language summaries for Java classes. In IEEE 21st International Conference on Program Comprehension, ICPC 2013, San Francisco, CA, USA, 20-21 May, 2013. 23–32. https://doi.org/10. 1109/ ICPC. 2013. 6613 830.

Giriprasad Sridhara, Emily Hill, Divya Muppaneni, Lori L. Pollock, and K. VijayShanker. 2010. Towards automatically generating summary comments for Java methods. In ASE. ACM, 43–52.

Brian P. Eddy, Jeffrey A. Robinson, Nicholas A. Kraft, and Jeffrey C. Carver. 2013. Evaluating source code summarization techniques: Replication and expansion. In IEEE 21st International Conference on Program Comprehension, ICPC 2013, San Francisco, CA, USA, 20-21 May, 2013. 13–22. https:// doi.org/10.1109/ICPC.2013. 6613829.

Sonia Haiduc, Jairo Aponte, Laura Moreno, and Andrian Marcus. 2010. On the Use of Automated Text Summarization Techniques for Summarizing Source Code. In 17th Working Conference on Reverse Engineering, WCRE 2010, 13-16 October 2010, Beverly, MA, USA. 35–44. https://doi.org/ 10.1109/ WCRE.2010.13.

Martin White, Michele Tufano, Christopher Vendome, and Denys Poshyvanyk. 2016. Deep learning code fragments for code clone detection. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, ASE 2016, Singapore, September 3-7, 2016. 87–98. https:// doi. org/ 10.1145/2970276. 2970326.

Edmund Wong, Taiyue Liu, and Lin Tan. 2015. CloCom: Mining existing source code for automatic comment generation. In 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, SANER 2015, Montreal, QC, Canada, March 2-6, 2015. 380–389. https://doi.org/ 10. 1109/ SANER.2015.7081848.

Xing Hu, Ge Li, Xin Xia, David Lo, and Zhi Jin. 2018. Deep code comment generation. In Proceedings of the 26th Conference on Program Comprehension.

Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016. Summarizing Source Code using a Neural Attention Model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers. https://www.aclweb. org/anthology/P16-1195/.

Alexander LeClair, Siyuan Jiang, and Collin McMillan. 2019. A neural model for generating natural language summaries of program subroutines. In Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019. 795–806. https://doi.org/ 10. 1109/ICSE.2019.00087.

Giriprasad Sridhara, Emily Hill, Divya Muppaneni, Lori L. Pollock, and K. VijayShanker. 2010. Towards automatically generating summary comments for Java methods. In ASE. ACM, 43–52.

Laura Moreno, Jairo Aponte, Giriprasad Sridhara, Andrian Marcus, Lori L. Pollock, and K. Vijay-Shanker. 2013. Automatic generation of natural language summaries for Java classes. In IEEE 21st International Conference on Program Comprehension, ICPC 2013, San Francisco, CA, USA, 20-21 May, 2013. 23–32. https://doi.org/10. 1109/ ICPC. 2013. 6613830.

Paul W. McBurney and Collin McMillan. 2016. Automatic Source Code Summarization of Context for Java Methods. IEEE Trans. Software Eng. 42, 2 (2016), 103–119. https:// doi.org/ 10.1109/TSE.2015.2465386.

Sonia Haiduc, Jairo Aponte, Laura Moreno, and Andrian Marcus. 2010. On the Use of Automated Text Summarization Techniques for Summarizing Source Code. In 17th Working Conference on Reverse Engineering, WCRE 2010, 13-16 October 2010, Beverly, MA, USA. 35–44. https://doi.org/ 10. 1109/WCRE.2010.13.

Edmund Wong, Jinqiu Yang, and Lin Tan. 2013. Auto Comment: Mining question and answer sites for automatic comment generation. In 2013 28th IEEE/ACM International Conference on Automated Software Engineering, ASE 2013, Silicon Valley, CA, USA, November 11-15, 2013. 55–67. https://doi.org/10.1109/ASE. 2013.6693113.

Downloads

Published

14-11-2023

Issue

Section

Articles

How to Cite

Ming, D., & Xiong, W. (2023). The Source Code Comment Generation based on Abstract Syntax Tree. Frontiers in Computing and Intelligent Systems, 5(3), 18-25. https://doi.org/10.54097/fcis.v5i3.13837