A Survey of Automatic Source Code Summarization

Source code summarization refers to the natural language description of the source code’s function. It can help developers easily understand the semantics of the source code. We can think of the source code and the corresponding summarization as being symmetric. However, the existing source code sum...

Bibliographic Details
Main Authors: Chunyan Zhang, Junchao Wang, Qinglei Zhou, Ting Xu, Ke Tang, Hairen Gui, Fudong Liu
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Symmetry
Subjects: source code summarization; deep learning; program analysis; neural machine translation
Online Access: https://www.mdpi.com/2073-8994/14/3/471
author Chunyan Zhang
Junchao Wang
Qinglei Zhou
Ting Xu
Ke Tang
Hairen Gui
Fudong Liu
collection DOAJ
description Source code summarization refers to the natural language description of the source code’s function. It helps developers understand the semantics of the source code. The source code and its corresponding summary can be regarded as symmetric. However, existing source code summaries are often mismatched with the source code, missing, or out of date. Manual source code summarization is inefficient and requires considerable human effort. To address these problems, many studies have been conducted on Automatic Source Code Summarization (ASCS). Given source code as input, ASCS techniques automatically generate a summary in natural language. In this paper, we review the development of ASCS technology. Almost all ASCS techniques involve the following stages: source code modeling, code summary generation, and quality evaluation. We further categorize the existing ASCS techniques by these stages and analyze their advantages and shortcomings. We also map the development of the existing algorithms.
first_indexed 2024-03-09T12:26:14Z
format Article
id doaj.art-e47818284d044a68a0d8df9b3065df28
institution Directory of Open Access Journals
issn 2073-8994
language English
last_indexed 2024-03-09T12:26:14Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-e47818284d044a68a0d8df9b3065df28
2023-11-30T22:35:05Z
eng
MDPI AG
Symmetry
2073-8994
2022-02-01
14(3), 471
10.3390/sym14030471
A Survey of Automatic Source Code Summarization
Chunyan Zhang (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China)
Junchao Wang (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China)
Qinglei Zhou (School of Information Engineering, ZhengZhou University, Zhengzhou 450001, China)
Ting Xu (School of Information Engineering, ZhengZhou University, Zhengzhou 450001, China)
Ke Tang (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China)
Hairen Gui (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China)
Fudong Liu (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China)
https://www.mdpi.com/2073-8994/14/3/471
source code summarization
deep learning
program analysis
neural machine translation
title A Survey of Automatic Source Code Summarization
topic source code summarization
deep learning
program analysis
neural machine translation
url https://www.mdpi.com/2073-8994/14/3/471