Employing Source Code Quality Analytics for Enriching Code Snippets Data

The availability of code snippets in online repositories like GitHub has led to an uptick in code reuse, further supporting an open-source, component-based development paradigm. The likelihood of code reuse rises when the code components or snippets are of high quality, especially in terms o...

Bibliographic Details
Main Authors: Thomas Karanikiotis, Themistoklis Diamantopoulos, Andreas Symeonidis
Format: Article
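Language: English
Published: MDPI AG 2023-08-01
Series: Data
Subjects: mining software repositories, source code mining, readability, static analysis metrics, code snippets
Online Access: https://www.mdpi.com/2306-5729/8/9/140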
author Thomas Karanikiotis
Themistoklis Diamantopoulos
Andreas Symeonidis
collection DOAJ
description The availability of code snippets in online repositories like GitHub has led to an uptick in code reuse, further supporting an open-source, component-based development paradigm. The likelihood of code reuse rises when the code components or snippets are of high quality, especially in terms of readability, which makes their integration and upkeep simpler. To this end, we have developed a dataset of code snippets that takes into account both the functional and the quality characteristics of the snippets. The dataset is based on the CodeSearchNet corpus and comprises additional information, including static analysis metrics, code violations, readability assessments, and source code similarity metrics. Using this dataset, both software researchers and practitioners can conveniently find and employ code snippets that satisfy diverse functional needs while also demonstrating excellent readability and maintainability.
first_indexed 2024-03-10T22:52:54Z
format Article
id doaj.art-16ba056a972541ab8111adcb838de8c2
institution Directory Open Access Journal
issn 2306-5729
language English
last_indexed 2024-03-10T22:52:54Z
publishDate 2023-08-01
publisher MDPI AG
record_format Article
series Data
spelling doaj.art-16ba056a972541ab8111adcb838de8c2; 2023-11-19T10:11:40Z; eng; MDPI AG; Data; ISSN 2306-5729; 2023-08-01; volume 8, issue 9, article 140; DOI 10.3390/data8090140; Employing Source Code Quality Analytics for Enriching Code Snippets Data; Thomas Karanikiotis, Themistoklis Diamantopoulos, Andreas Symeonidis (all: Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece); https://www.mdpi.com/2306-5729/8/9/140; keywords: mining software repositories, source code mining, readability, static analysis metrics, code snippets
title Employing Source Code Quality Analytics for Enriching Code Snippets Data
topic mining software repositories
source code mining
readability
static analysis metrics
code snippets
url https://www.mdpi.com/2306-5729/8/9/140