Pseudocode Generation from Source Code Using the BART Model
In the software development process, multiple developers may work on the same program, and bugs may be fixed by a developer other than the one who wrote the code; understanding the source code is therefore an important issue. Pseudocode plays an important role in solving this problem, as it helps the developer to understand the source code.
Main Authors: | Anas Alokla, Walaa Gad, Waleed Nazih, Mustafa Aref, Abdel-badeeh Salem
---|---
Format: | Article
Language: | English
Published: | MDPI AG, 2022-10-01
Series: | Mathematics
Subjects: | pseudocode generation; BERT; GPT; BART; natural language processing; neural machine translation
Online Access: | https://www.mdpi.com/2227-7390/10/21/3967
_version_ | 1797467399605190656 |
---|---|
author | Anas Alokla; Walaa Gad; Waleed Nazih; Mustafa Aref; Abdel-badeeh Salem
author_sort | Anas Alokla |
collection | DOAJ |
description | In the software development process, multiple developers may work on the same program, and bugs may be fixed by a developer other than the one who wrote the code; understanding the source code is therefore an important issue. Pseudocode plays an important role in solving this problem, as it helps the developer to understand the source code. Recently, transformer-based pre-trained models have achieved remarkable results in machine translation, a task similar to pseudocode generation. In this paper, we propose a novel approach for automatic pseudocode generation from source code based on a pre-trained Bidirectional and Auto-Regressive Transformer (BART) model. We fine-tuned two pre-trained BART models (i.e., large and base) using a dataset containing source code and its equivalent pseudocode. In addition, two benchmark datasets (i.e., Django and SPoC) were used to evaluate the proposed model. The proposed model based on the BART large model outperforms other state-of-the-art models in terms of BLEU score by 15% and 27% on the Django and SPoC datasets, respectively.
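The abstract describes the setup (fine-tuning pre-trained BART checkpoints on paired source code and pseudocode), but this record carries no training code. The sketch below is a rough illustration of that kind of setup using the Hugging Face transformers and datasets libraries; facebook/bart-large is the public release of the model family the paper names, while read_pairs, the django_train.tsv file name, and all hyperparameters are illustrative assumptions, not the authors' pipeline.

```python
# Minimal sketch (not the authors' code): fine-tune a pre-trained BART model
# on (source code, pseudocode) pairs, as the abstract describes.
from datasets import Dataset
from transformers import (
    BartForConditionalGeneration,
    BartTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

def read_pairs(path):
    # Hypothetical loader: one "source_code<TAB>pseudocode" pair per line.
    with open(path, encoding="utf-8") as f:
        rows = [line.rstrip("\n").split("\t") for line in f]
    return Dataset.from_dict(
        {"code": [r[0] for r in rows], "pseudo": [r[1] for r in rows]}
    )

checkpoint = "facebook/bart-large"  # the paper also fine-tunes the base model
tokenizer = BartTokenizer.from_pretrained(checkpoint)
model = BartForConditionalGeneration.from_pretrained(checkpoint)

def tokenize(batch):
    # Encode source code as the encoder input and pseudocode as the labels.
    enc = tokenizer(batch["code"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["pseudo"], max_length=128, truncation=True)
    enc["labels"] = labels["input_ids"]
    return enc

train_set = read_pairs("django_train.tsv").map(  # hypothetical file name
    tokenize, batched=True, remove_columns=["code", "pseudo"]
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="bart-pseudocode",
        num_train_epochs=3,             # illustrative hyperparameters only
        per_device_train_batch_size=16,
        learning_rate=3e-5,
    ),
    train_dataset=train_set,
    # The collator pads labels with -100 so padding is ignored by the loss.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

# After training, generate pseudocode for a new statement with beam search.
inputs = tokenizer("if x < 0 : x = 0", return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
output = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The beam-search call at the end shows how a fine-tuned model would be queried for a single statement; the paper's actual preprocessing and tokenization of the Django and SPoC datasets may differ.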
first_indexed | 2024-03-09T18:53:05Z |
format | Article |
id | doaj.art-9ec0963c527e44bba2e10e8fcfb5615c |
institution | Directory Open Access Journal |
issn | 2227-7390 |
language | English |
last_indexed | 2024-03-09T18:53:05Z |
publishDate | 2022-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | Record ID: doaj.art-9ec0963c527e44bba2e10e8fcfb5615c
Last updated: 2023-11-24T05:42:51Z
Language: eng
Publisher: MDPI AG
Journal: Mathematics, ISSN 2227-7390
Published: 2022-10-01, Volume 10, Issue 21, Article 3967
DOI: 10.3390/math10213967
Title: Pseudocode Generation from Source Code Using the BART Model
Authors and affiliations:
- Anas Alokla, Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo 11566, Egypt
- Walaa Gad, Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo 11566, Egypt
- Waleed Nazih, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al Kharj 11942, Saudi Arabia
- Mustafa Aref, Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo 11566, Egypt
- Abdel-badeeh Salem, Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo 11566, Egypt
Online access: https://www.mdpi.com/2227-7390/10/21/3967
Keywords: pseudocode generation; BERT; GPT; BART; natural language processing; neural machine translation
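The abstract reports its headline result as BLEU gains of 15% on Django and 27% on SPoC, but the evaluation script is not part of this record. Below is a minimal, assumed sketch of corpus-level BLEU scoring with the sacrebleu library, where hypotheses.txt and references.txt are hypothetical file names for the generated and ground-truth pseudocode.

```python
# Minimal BLEU-scoring sketch (assumed setup, not the authors' script).
import sacrebleu

# One generated pseudocode line per source statement (hypothetical files).
with open("hypotheses.txt", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("references.txt", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# sacrebleu expects a list of reference streams; here each hypothesis
# has exactly one reference.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"Corpus BLEU: {bleu.score:.2f}")
```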
title | Pseudocode Generation from Source Code Using the BART Model |
topic | pseudocode generation; BERT; GPT; BART; natural language processing; neural machine translation
url | https://www.mdpi.com/2227-7390/10/21/3967 |