SJBCD: A Java Code Clone Detection Method Based on Bytecode Using Siamese Neural Network

Code clone detection is an important research topic in software engineering. Effectively discovering code clones within and between software systems matters both for software development and for resolving software infringement disputes. In practical engineering applications, clone de...


Bibliographic Details
Main Authors: Bangrui Wan, Shuang Dong, Jianjun Zhou, Ying Qian
Format: Article
Language: English
Published: MDPI AG 2023-08-01
Series: Applied Sciences
Subjects: code clone, code clone detection, bytecode, siamese neural network
Online Access: https://www.mdpi.com/2076-3417/13/17/9580
_version_ 1797582841038503936
author Bangrui Wan
Shuang Dong
Jianjun Zhou
Ying Qian
collection DOAJ
description Code clone detection is an important research topic in software engineering. Effectively discovering code clones within and between software systems matters both for software development and for resolving software infringement disputes. In practical engineering applications, clone detection can often only be performed on compiled code because the source code is unavailable. In addition, existing bytecode-based methods leave room for improvement in detection performance. For these reasons, this paper proposes a novel code clone detection method for Java bytecode: SJBCD. SJBCD extracts opcode sequences from bytecode files, uses GloVe to vectorize the opcodes, and builds a GRU-based Siamese neural network that is trained in a supervised manner. The trained network is then used to detect code clones. To demonstrate the effectiveness of SJBCD, the paper conducts validation experiments on the BigCloneBench dataset and provides a comparative analysis with four other methods. The experimental results show that SJBCD is effective.
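The pipeline named in the description (opcode sequences, opcode vectors, a shared GRU encoder, a pairwise comparison) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the opcode vocabulary, the dimensions, the random vectors standing in for GloVe embeddings, the untrained and bias-free GRU weights, and the exp(-L1 distance) similarity are all assumptions made for the example.

```python
import math
import random

# Hypothetical opcode vocabulary; a real system would cover the full JVM set.
VOCAB = ["aload_0", "iload_1", "iload_2", "iadd", "imul",
         "istore_3", "ireturn", "invokevirtual"]
EMBED_DIM, HIDDEN_DIM = 4, 3  # assumed sizes, chosen only to keep this small

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random weights; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Stand-in for GloVe vectors: random here, pretrained on opcode data in practice.
EMBEDDINGS = {op: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
              for op in VOCAB}

# A single set of GRU weights (update z, reset r, candidate h; biases omitted).
# Both branches of the Siamese network reuse these same weights.
W = {gate: rand_matrix(HIDDEN_DIM, EMBED_DIM) for gate in "zrh"}
U = {gate: rand_matrix(HIDDEN_DIM, HIDDEN_DIM) for gate in "zrh"}

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(h, x):
    """One GRU time step: gate the previous state h with the input x."""
    z = [sigmoid(a + b) for a, b in zip(matvec(W["z"], x), matvec(U["z"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(W["r"], x), matvec(U["r"], h))]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(W["h"], x), matvec(U["h"], reset_h))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def encode(opcodes):
    """Run the shared GRU over an opcode sequence; return the final state."""
    h = [0.0] * HIDDEN_DIM
    for op in opcodes:
        h = gru_step(h, EMBEDDINGS[op])
    return h

def similarity(seq_a, seq_b):
    """Encode both sequences with the same weights and compare the states."""
    ha, hb = encode(seq_a), encode(seq_b)
    return math.exp(-sum(abs(a - b) for a, b in zip(ha, hb)))

method_a = ["iload_1", "iload_2", "iadd", "ireturn"]
method_b = ["iload_1", "iload_2", "imul", "ireturn"]
print(similarity(method_a, method_a), similarity(method_a, method_b))
```

Because both inputs pass through the same encoder, identical opcode sequences always score exactly 1.0 and the score is symmetric in its arguments. Supervised training on labeled clone pairs would then adjust the shared weights so that true clones score high and non-clones score low.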
first_indexed 2024-03-10T23:28:17Z
format Article
id doaj.art-01ad4f5774e04d3e8347d3fbc66cbe6c
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T23:28:17Z
publishDate 2023-08-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-01ad4f5774e04d3e8347d3fbc66cbe6c; 2023-11-19T07:49:01Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2023-08-01; 13; 17; 9580; 10.3390/app13179580; SJBCD: A Java Code Clone Detection Method Based on Bytecode Using Siamese Neural Network; Bangrui Wan, Shuang Dong, Jianjun Zhou, Ying Qian (all: School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China); https://www.mdpi.com/2076-3417/13/17/9580; code clone; code clone detection; bytecode; siamese neural network
title SJBCD: A Java Code Clone Detection Method Based on Bytecode Using Siamese Neural Network
topic code clone
code clone detection
bytecode
siamese neural network
url https://www.mdpi.com/2076-3417/13/17/9580