Secure tumor classification by shallow neural network using homomorphic encryption

Bibliographic Details
Main Authors:	Seungwan Hong, Jai Hyun Park, Wonhee Cho, Hyeongmin Choe, Jung Hee Cheon
Format:	Article
Language:	English
Published:	BMC 2022-04-01
Series:	BMC Genomics
Subjects:	Homomorphic encryption Multi-label classification Privacy Neural network Softmax activation
Online Access:	https://doi.org/10.1186/s12864-022-08469-w

_version_	1819210150644285440
author	Seungwan Hong Jai Hyun Park Wonhee Cho Hyeongmin Choe Jung Hee Cheon
author_sort	Seungwan Hong
collection	DOAJ
description	Abstract Background: Disclosing patients’ genetic information while applying machine learning techniques to tumor classification compromises the privacy of personal information. Homomorphic encryption (HE), which supports operations on encrypted data, is one tool for performing such computation without information leakage, but the limited set of operations that HE supports makes it hard to apply general machine learning algorithms directly. In particular, non-polynomial activation functions, including softmax, are difficult to implement with HE and require a suitable approximation method to minimize the loss of accuracy. The iDASH 2020 secure genome analysis competition set as a task the development of a multi-label tumor classification method that uses HE to predict the class of samples from genetic information. Methods: We develop a secure multi-label tumor classification method using HE that preserves privacy throughout all computations of the model inference process. Our solution is based on a 1-layer neural network with the softmax activation function and uses the approximate HE scheme. We present an approximation method that enables softmax activation in the model using HE, and a technique for efficiently encoding data to reduce computational costs. In addition, we propose an HE-friendly data filtering method to reduce the size of large-scale genetic data. Results: We analyze a dataset from The Cancer Genome Atlas (TCGA) that consists of 3,622 samples from 11 types of cancer, with genetic features from 25,128 genes. Our preprocessing method reduces the number of genes to 4,096 or fewer and achieves a microAUC of 0.9882 (85% accuracy) with a 1-layer shallow neural network. Using our model, we compute the tumor classification inference steps on the encrypted test data in 3.75 minutes. Owing to its exceptionally high microAUC, our solution was awarded co-first place in iDASH 2020 Track 1: “Secure multi-label Tumor classification using Homomorphic Encryption”. Conclusions: Our solution is the first implementation of a neural network model with softmax activation using HE. The HE optimization methods presented in this work can also support machine learning implementations using HE and other challenging HE applications.
first_indexed	2024-12-23T06:06:36Z
format	Article
id	doaj.art-c86c4799970f403aa2aba50d41e42fef
institution	Directory Open Access Journal
issn	1471-2164
language	English
last_indexed	2024-12-23T06:06:36Z
publishDate	2022-04-01
publisher	BMC
record_format	Article
series	BMC Genomics
spelling	doaj.art-c86c4799970f403aa2aba50d41e42fef; 2022-12-21T17:57:33Z; eng; BMC; BMC Genomics; 1471-2164; 2022-04-01; 23; 1; 1; 19; 10.1186/s12864-022-08469-w; Secure tumor classification by shallow neural network using homomorphic encryption; Seungwan Hong (0), Jai Hyun Park (1), Wonhee Cho (2), Hyeongmin Choe (3), Jung Hee Cheon (4); Department of Mathematical Sciences, Seoul National University (all five authors); https://doi.org/10.1186/s12864-022-08469-w; Homomorphic encryption; Multi-label classification; Privacy; Neural network; Softmax activation
title	Secure tumor classification by shallow neural network using homomorphic encryption
topic	Homomorphic encryption Multi-label classification Privacy Neural network Softmax activation
url	https://doi.org/10.1186/s12864-022-08469-w
