Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data

Bibliographic Details
Main Authors: Md. Mohaiminul Islam, Noman Mohammed, Yang Wang, Pingzhao Hu
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-06-01
Series: Frontiers in Oncology
Subjects: deep learning; differential privacy; Rényi differential privacy; breast cancer; omics data
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2022.879607/full
author Md. Mohaiminul Islam
Noman Mohammed
Yang Wang
Pingzhao Hu
author_facet Md. Mohaiminul Islam
Noman Mohammed
Yang Wang
Pingzhao Hu
author_sort Md. Mohaiminul Islam
collection DOAJ
description Proper analysis of high-dimensional human genomic data is necessary to advance knowledge of fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to uniquely identify an individual (i.e., a privacy violation). Therefore, raw genomic datasets cannot be publicly released or shared with researchers. The recent success of deep learning (DL) on diverse problems has demonstrated its suitability for analyzing large volumes of high-dimensional genomic data. Still, DL-based models leak information about their training samples. To overcome this challenge, differential privacy mechanisms can be incorporated into the DL analysis framework to protect individuals’ privacy. We proposed a differential-privacy-based DL framework to solve two biological problems: breast cancer status (BCS) and cancer type (CT) classification, and drug sensitivity prediction. To predict BCS and CT from genomic data, we built a differentially private (DP) deep autoencoder (dpAE) on private gene expression datasets that learns a low-dimensional representation of the data. We used the dpAE features to build multiple DP binary classifiers that predict BCS and CT for any individual. To predict drug sensitivity, we used the Genomics of Drug Sensitivity in Cancer (GDSC) dataset. We extracted dpAE features from GDSC to build our DP drug sensitivity prediction model for 265 drugs. Evaluation of our proposed DP framework shows that it achieves better performance in predicting BCS, CT, and drug sensitivity than previously published DP work.
first_indexed 2024-04-13T17:20:25Z
format Article
id doaj.art-64330d6792534c08ac7d66bfe72798b6
institution Directory of Open Access Journals
issn 2234-943X
language English
last_indexed 2024-04-13T17:20:25Z
publishDate 2022-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
spelling doaj.art-64330d6792534c08ac7d66bfe72798b6
2022-12-22T02:38:00Z
eng
Frontiers Media S.A.
Frontiers in Oncology
2234-943X
2022-06-01
12
10.3389/fonc.2022.879607
879607
Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data
Md. Mohaiminul Islam
Noman Mohammed
Yang Wang
Pingzhao Hu
Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada
Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB, Canada
Research Institute for Oncology and Hematology, CancerCare Manitoba, Winnipeg, MB, Canada
https://www.frontiersin.org/articles/10.3389/fonc.2022.879607/full
deep learning
differential privacy
Rényi differential privacy
breast cancer
omics data
spellingShingle Md. Mohaiminul Islam
Noman Mohammed
Yang Wang
Pingzhao Hu
Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data
Frontiers in Oncology
deep learning
differential privacy
Rényi differential privacy
breast cancer
omics data
title Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data
topic deep learning
differential privacy
Rényi differential privacy
breast cancer
omics data
url https://www.frontiersin.org/articles/10.3389/fonc.2022.879607/full
work_keys_str_mv AT mdmohaiminulislam differentialprivatedeeplearningmodelsforanalyzingbreastcanceromicsdata
AT nomanmohammed differentialprivatedeeplearningmodelsforanalyzingbreastcanceromicsdata
AT yangwang differentialprivatedeeplearningmodelsforanalyzingbreastcanceromicsdata
AT pingzhaohu differentialprivatedeeplearningmodelsforanalyzingbreastcanceromicsdata