Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts
Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties. Being the most prominent platform that supports smart contracts, Ethereum has been targeted by many attacks and plagued by security incidents. Cons...
Main Authors: | Nami Ashizawa, Naoto Yanai, Jason Paul Cruz, Shingo Okamura |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2022-12-01 |
Series: | Blockchain: Research and Applications |
Subjects: | Ethereum, Smart contracts, Blockchain, Neural networks, Static analysis, Code similarity |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2096720922000422 |
_version_ | 1811198390018506752 |
---|---|
author | Nami Ashizawa Naoto Yanai Jason Paul Cruz Shingo Okamura |
author_facet | Nami Ashizawa Naoto Yanai Jason Paul Cruz Shingo Okamura |
author_sort | Nami Ashizawa |
collection | DOAJ |
description | Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties. Being the most prominent platform that supports smart contracts, Ethereum has been targeted by many attacks and plagued by security incidents. Consequently, many smart contract vulnerabilities have been discovered in the past decade. To detect and prevent such vulnerabilities, different security analysis tools, including static and dynamic analysis tools, have been created, but their performance decreases drastically when codes to be analyzed are constantly being rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool that detects smart contract vulnerabilities. Eth2Vec maintains its robustness against code rewrites; i.e., it can detect vulnerabilities even in rewritten codes. Other machine-learning-based static analysis tools require features, which analysts create manually, as inputs. In contrast, Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts. In doing so, Eth2Vec can detect vulnerabilities in smart contracts by comparing the similarities between the codes of a target contract and those of the learned contracts. We performed experiments with existing open databases, such as Etherscan, and Eth2Vec was able to outperform a recent model based on support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score. |
first_indexed | 2024-04-12T01:30:19Z |
format | Article |
id | doaj.art-112ce893f00b4527ad8e22ffc5ea4dc0 |
institution | Directory Open Access Journal |
issn | 2666-9536 |
language | English |
last_indexed | 2024-04-12T01:30:19Z |
publishDate | 2022-12-01 |
publisher | Elsevier |
record_format | Article |
series | Blockchain: Research and Applications |
spelling | doaj.art-112ce893f00b4527ad8e22ffc5ea4dc0 2022-12-22T03:53:30Z eng Elsevier Blockchain: Research and Applications 2666-9536 2022-12-01 3 4 100101 Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts Nami Ashizawa 0 Naoto Yanai 1 Jason Paul Cruz 2 Shingo Okamura 3 School of Information Science and Technology, Osaka University, Osaka 565-0871, Japan School of Information Science and Technology, Osaka University, Osaka 565-0871, Japan; Corresponding author. Osaka University, Osaka 565-0871, Japan National Institute of Technology, Nara College, Nara 639-1080, Japan Ethereum smart contracts are computer programs that are deployed and executed on the Ethereum blockchain to enforce agreements among untrusting parties. Being the most prominent platform that supports smart contracts, Ethereum has been targeted by many attacks and plagued by security incidents. Consequently, many smart contract vulnerabilities have been discovered in the past decade. To detect and prevent such vulnerabilities, different security analysis tools, including static and dynamic analysis tools, have been created, but their performance decreases drastically when codes to be analyzed are constantly being rewritten. In this paper, we propose Eth2Vec, a machine-learning-based static analysis tool that detects smart contract vulnerabilities. Eth2Vec maintains its robustness against code rewrites; i.e., it can detect vulnerabilities even in rewritten codes. Other machine-learning-based static analysis tools require features, which analysts create manually, as inputs. In contrast, Eth2Vec uses a neural network for language processing to automatically learn the features of vulnerable contracts. In doing so, Eth2Vec can detect vulnerabilities in smart contracts by comparing the similarities between the codes of a target contract and those of the learned contracts. We performed experiments with existing open databases, such as Etherscan, and Eth2Vec was able to outperform a recent model based on support vector machine in terms of well-known metrics, i.e., precision, recall, and F1-score. http://www.sciencedirect.com/science/article/pii/S2096720922000422 Ethereum Smart contracts Blockchain Neural networks Static analysis Code similarity |
spellingShingle | Nami Ashizawa Naoto Yanai Jason Paul Cruz Shingo Okamura Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts Blockchain: Research and Applications Ethereum Smart contracts Blockchain Neural networks Static analysis Code similarity |
title | Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts |
title_full | Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts |
title_fullStr | Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts |
title_full_unstemmed | Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts |
title_short | Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts |
title_sort | eth2vec learning contract wide code representations for vulnerability detection on ethereum smart contracts |
topic | Ethereum Smart contracts Blockchain Neural networks Static analysis Code similarity |
url | http://www.sciencedirect.com/science/article/pii/S2096720922000422 |
work_keys_str_mv | AT namiashizawa eth2veclearningcontractwidecoderepresentationsforvulnerabilitydetectiononethereumsmartcontracts AT naotoyanai eth2veclearningcontractwidecoderepresentationsforvulnerabilitydetectiononethereumsmartcontracts AT jasonpaulcruz eth2veclearningcontractwidecoderepresentationsforvulnerabilitydetectiononethereumsmartcontracts AT shingookamura eth2veclearningcontractwidecoderepresentationsforvulnerabilitydetectiononethereumsmartcontracts |
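
The description above says that detection works by comparing the code of a target contract with the code of learned contracts, and that results are reported as precision, recall, and F1-score. The following is a minimal sketch of those two ideas only, not the Eth2Vec implementation. It assumes contracts have already been turned into fixed-length vectors by some model. The function names, the two-dimensional vectors, and the labels are all hypothetical, and cosine similarity is an assumed choice of similarity measure that the record does not specify.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_label(target, learned):
    """Return (label, similarity) of the learned vector closest to the target.

    `learned` is a list of (label, vector) pairs; both the labels and the
    vectors are hypothetical stand-ins for whatever a trained model produces.
    """
    best_label, best_sim = None, -1.0
    for label, vec in learned:
        sim = cosine_similarity(target, vec)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 from parallel lists of 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical learned vectors and a hypothetical target contract vector.
learned = [("reentrancy", [1.0, 0.0]), ("safe", [0.0, 1.0])]
label, sim = most_similar_label([0.9, 0.1], learned)

# Toy ground truth and predictions (1 = vulnerable, 0 = not vulnerable).
metrics = precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0])
```

In this toy run the target vector is closest to the hypothetical "reentrancy" entry, and the toy predictions contain one true positive, one false positive, and one false negative, so all three metrics equal 0.5.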