Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review

This paper provides a comprehensive review of the literature concerning the utilization of Natural Language Processing (NLP) techniques, with a particular focus on transformer-based large language models (LLMs) trained using Big Code, within the domain of AI-assisted programming tasks. LLMs, augment...


Bibliographic Details
Main Authors: Man-Fai Wong, Shangxin Guo, Ching-Nam Hang, Siu-Wai Ho, Chee-Wei Tan
Format: Article
Language: English
Published: MDPI AG 2023-06-01
Series: Entropy
Subjects: software naturalness; large language models; AI-assisted programming
Online Access: https://www.mdpi.com/1099-4300/25/6/888
author Man-Fai Wong
Shangxin Guo
Ching-Nam Hang
Siu-Wai Ho
Chee-Wei Tan
collection DOAJ
description This paper reviews the literature on Natural Language Processing (NLP) techniques for AI-assisted programming tasks, with a particular focus on transformer-based large language models (LLMs) trained using Big Code. LLMs, augmented with software naturalness, have played a crucial role in facilitating AI-assisted programming applications, including code generation, code completion, code translation, code refinement, code summarization, defect detection, and clone detection. Notable examples of such applications include GitHub Copilot, powered by OpenAI’s Codex, and DeepMind’s AlphaCode. This paper presents an overview of the major LLMs and their applications in downstream tasks related to AI-assisted programming. Furthermore, it explores the challenges of and opportunities for incorporating NLP techniques with software naturalness in these applications, and discusses extending AI-assisted programming capabilities to Apple’s Xcode for mobile software development. The aim is to empower developers with advanced coding assistance and to streamline the software development process.
first_indexed 2024-03-11T02:30:20Z
format Article
id doaj.art-fe8a5d66d6424a0ab58789ef20d2fa01
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-03-11T02:30:20Z
publishDate 2023-06-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-fe8a5d66d6424a0ab58789ef20d2fa01
2023-11-18T10:17:50Z
eng
MDPI AG
Entropy
1099-4300
2023-06-01
Vol. 25, No. 6, Art. 888
10.3390/e25060888
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
Man-Fai Wong (Department of Computer Science, City University of Hong Kong, Hong Kong, China)
Shangxin Guo (Shenzhen Research Institute, City University of Hong Kong, Shenzhen 518057, China)
Ching-Nam Hang (Department of Computer Science, City University of Hong Kong, Hong Kong, China)
Siu-Wai Ho (Teletraffic Research Centre, University of Adelaide, Adelaide, SA 5005, Australia)
Chee-Wei Tan (School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore)
https://www.mdpi.com/1099-4300/25/6/888
software naturalness
large language models
AI-assisted programming
title Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
topic software naturalness
large language models
AI-assisted programming
url https://www.mdpi.com/1099-4300/25/6/888