Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts

Bibliographic Details
Main Authors: Nami Ashizawa, Naoto Yanai, Jason Paul Cruz, Shingo Okamura
Format: Article
Language: English
Published: Elsevier 2022-12-01
Series: Blockchain: Research and Applications
Subjects: Ethereum; Smart contracts; Blockchain; Neural networks; Static analysis; Code similarity
Online Access: http://www.sciencedirect.com/science/article/pii/S2096720922000422
author Nami Ashizawa
Naoto Yanai
Jason Paul Cruz
Shingo Okamura
collection DOAJ
description Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties. As the most prominent platform supporting smart contracts, Ethereum has been targeted by many attacks and plagued by security incidents. Consequently, many smart contract vulnerabilities have been discovered in the past decade. To detect and prevent such vulnerabilities, various security analysis tools, including static and dynamic analysis tools, have been created, but their performance decreases drastically when the code under analysis is constantly being rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool that detects smart contract vulnerabilities. Eth2Vec remains robust against code rewrites; i.e., it can detect vulnerabilities even in rewritten code. Other machine-learning-based static analysis tools require manually created features as inputs. In contrast, Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts. Eth2Vec can thus detect vulnerabilities in a smart contract by comparing the similarity between the code of the target contract and that of the learned contracts. We performed experiments with existing open databases, such as Etherscan, and Eth2Vec outperformed a recent model based on a support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score.
first_indexed 2024-04-12T01:30:19Z
format Article
id doaj.art-112ce893f00b4527ad8e22ffc5ea4dc0
institution Directory Open Access Journal
issn 2666-9536
language English
last_indexed 2024-04-12T01:30:19Z
publishDate 2022-12-01
publisher Elsevier
record_format Article
series Blockchain: Research and Applications
spelling doaj.art-112ce893f00b4527ad8e22ffc5ea4dc0; 2022-12-22T03:53:30Z; eng; Elsevier; Blockchain: Research and Applications; 2666-9536; 2022-12-01; volume 3, issue 4, article 100101
Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts
Nami Ashizawa (School of Information Science and Technology, Osaka University, Osaka 565-0871, Japan)
Naoto Yanai (School of Information Science and Technology, Osaka University, Osaka 565-0871, Japan; corresponding author)
Jason Paul Cruz (Osaka University, Osaka 565-0871, Japan)
Shingo Okamura (National Institute of Technology, Nara College, Nara 639-1080, Japan)
http://www.sciencedirect.com/science/article/pii/S2096720922000422
Ethereum; Smart contracts; Blockchain; Neural networks; Static analysis; Code similarity
title Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts
topic Ethereum
Smart contracts
Blockchain
Neural networks
Static analysis
Code similarity
url http://www.sciencedirect.com/science/article/pii/S2096720922000422
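
The description above says that vulnerabilities are detected by comparing the code of a target contract with that of learned contracts, and that results are reported as precision, recall, and F1-score. The following is only a minimal sketch of that general idea, not the authors' implementation: the vectors are hand-written stand-ins for embeddings that Eth2Vec would learn with a neural network, and the function names and labels (`cosine`, `most_similar`, `precision_recall_f1`, "reentrancy", "clean") are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(target, learned):
    """Return (label, score) of the learned vector closest to the target.

    `learned` is a list of (label, vector) pairs; both the labels and the
    vectors are illustrative assumptions, not data from the paper.
    """
    scored = ((label, cosine(target, vec)) for label, vec in learned)
    return max(scored, key=lambda pair: pair[1])


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1-score from 0/1 label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy "learned" contracts: hand-made 2-D vectors, purely for illustration.
learned = [("reentrancy", [1.0, 0.0]), ("clean", [0.0, 1.0])]
label, score = most_similar([0.9, 0.1], learned)
print(label, round(score, 3))
print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

In this toy setup, the target is assigned the label of its nearest learned vector; a real system would need a similarity threshold and embeddings trained on actual contract code.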