Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data
Proper analysis of high-dimensional human genomic data is necessary to increase human knowledge about fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to identify an individu...
Main Authors: | Md. Mohaiminul Islam, Noman Mohammed, Yang Wang, Pingzhao Hu |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2022-06-01 |
Series: | Frontiers in Oncology |
Subjects: | deep learning, differential privacy, Rényi differential privacy, breast cancer, omics data |
Online Access: | https://www.frontiersin.org/articles/10.3389/fonc.2022.879607/full |
_version_ | 1811335185485004800 |
---|---|
author | Md. Mohaiminul Islam, Noman Mohammed, Yang Wang, Pingzhao Hu |
author_sort | Md. Mohaiminul Islam |
collection | DOAJ |
description | Proper analysis of high-dimensional human genomic data is necessary to increase human knowledge about fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to uniquely identify an individual (i.e., a privacy violation). Therefore, raw genomic datasets cannot be publicly published or shared with researchers. The recent success of deep learning (DL) in diverse problems proved its suitability for analyzing the high volume of high-dimensional genomic data. Still, DL-based models leak information about the training samples. To overcome this challenge, we can incorporate differential privacy mechanisms into the DL analysis framework, as differential privacy can protect individuals’ privacy. We proposed a differential privacy based DL framework to solve two biological problems: breast cancer status (BCS) and cancer type (CT) classification, and drug sensitivity prediction. To predict BCS and CT using genomic data, we built a differentially private (DP) deep autoencoder (dpAE) on private gene expression datasets that learns a low-dimensional data representation. We used dpAE features to build multiple DP binary classifiers to predict BCS and CT in any individual. To predict drug sensitivity, we used the Genomics of Drug Sensitivity in Cancer (GDSC) dataset. We extracted GDSC’s dpAE features to build our DP drug sensitivity prediction model for 265 drugs. Evaluation of our proposed DP framework shows that it achieves better prediction performance for BCS, CT, and drug sensitivity than previously published DP work. |
first_indexed | 2024-04-13T17:20:25Z |
format | Article |
id | doaj.art-64330d6792534c08ac7d66bfe72798b6 |
institution | Directory of Open Access Journals |
issn | 2234-943X |
language | English |
last_indexed | 2024-04-13T17:20:25Z |
publishDate | 2022-06-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Oncology |
spelling | doaj.art-64330d6792534c08ac7d66bfe72798b6; 2022-12-22T02:38:00Z; eng; Frontiers Media S.A.; Frontiers in Oncology; 2234-943X; 2022-06-01; 12; 10.3389/fonc.2022.879607; 879607; Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data; Md. Mohaiminul Islam (0), Noman Mohammed (1), Yang Wang (2), Pingzhao Hu (3, 4, 5, 6); Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada; Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada; Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada; Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB, Canada; Research Institute for Oncology and Hematology, CancerCare Manitoba, Winnipeg, MB, Canada; https://www.frontiersin.org/articles/10.3389/fonc.2022.879607/full; deep learning; differential privacy; Rényi differential privacy; breast cancer; omics data |
title | Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data |
topic | deep learning, differential privacy, Rényi differential privacy, breast cancer, omics data |
url | https://www.frontiersin.org/articles/10.3389/fonc.2022.879607/full |