Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM

Identifying the subcellular localization of a given protein is an essential part of biological and medical research, since the protein must be localized in the correct organelle to ensure physiological function. Conventional biological experiments for protein subcellular localization have some limit...

Bibliographic Details
Main Authors: Liwen Wu, Song Gao, Shaowen Yao, Feng Wu, Jie Li, Yunyun Dong, Yunqi Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-06-01
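Series: Frontiers in Genetics
Subjects: protein subcellular localization; class imbalance learning; multi-label classification; generative adversarial networks; deep learning
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2022.912614/full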
_version_ 1811243225730514944
author Liwen Wu
Song Gao
Shaowen Yao
Feng Wu
Jie Li
Yunyun Dong
Yunqi Zhang
author_facet Liwen Wu
Song Gao
Shaowen Yao
Feng Wu
Jie Li
Yunyun Dong
Yunqi Zhang
author_sort Liwen Wu
collection DOAJ
description Identifying the subcellular localization of a given protein is an essential part of biological and medical research, since a protein must be localized in the correct organelle to perform its physiological function. Conventional biological experiments for protein subcellular localization have limitations, such as high cost and low efficiency, so many computational methods have been proposed to address them. However, some of these methods still need improvement when the localization classes are imbalanced. We propose a new model, generating minority samples for protein subcellular localization (Gm-PLoc), to predict the subcellular localization of multi-label proteins. The model has three steps: using the position-specific scoring matrix to extract distinguishing features of proteins; synthesizing minority-category samples with revised generative adversarial networks to balance the category distribution; and training a classifier on the rebalanced dataset to predict the subcellular localization of multi-label proteins. One benchmark dataset is used to evaluate the model, and the experimental results show that Gm-PLoc performs well on multi-label protein subcellular localization.
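The rebalancing step named in the description can be illustrated with a minimal sketch. It only counts how many synthetic samples each minority label would need in order to match the most frequent label; the generator itself (the revised generative adversarial networks) is not reproduced. The function name `minority_deficits`, the toy location labels, and the balance-to-the-majority target are assumptions made for illustration, not details taken from the article.

```python
from collections import Counter


def minority_deficits(label_sets):
    """Per label, the number of extra samples needed to reach the
    count of the most frequent label (hypothetical balancing target)."""
    # A multi-label protein contributes one count to every label it carries.
    counts = Counter(label for labels in label_sets for label in labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


# Toy multi-label data: each protein may carry several location labels.
proteins = [
    {"nucleus"},
    {"nucleus", "cytoplasm"},
    {"nucleus"},
    {"cytoplasm"},
    {"mitochondrion"},
]

deficits = minority_deficits(proteins)
print(deficits)
```

In a pipeline of the kind the description outlines, these deficits would tell a sample generator how many synthetic feature vectors to produce per minority label before the classifier is trained.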
first_indexed 2024-04-12T14:03:49Z
format Article
id doaj.art-f254ce0897a24bdd9fbf600c04c92efd
institution Directory of Open Access Journals
issn 1664-8021
language English
last_indexed 2024-04-12T14:03:49Z
publishDate 2022-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-f254ce0897a24bdd9fbf600c04c92efd
2022-12-22T03:30:08Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2022-06-01
volume 13
doi 10.3389/fgene.2022.912614
article number 912614
Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM
Liwen Wu, Song Gao, Shaowen Yao, Feng Wu, Jie Li, Yunyun Dong: Engineering Research Center of Cyberspace, Yunnan University, Kunming, China; School of Software, Yunnan University, Kunming, China
Yunqi Zhang: Engineering Research Center of Cyberspace, Yunnan University, Kunming, China; School of Software, Yunnan University, Kunming, China; Yunnan Key Laboratory of Statistical Modeling and Data Analysis, School of Mathematics and Statistics, Yunnan University, Kunming, China
https://www.frontiersin.org/articles/10.3389/fgene.2022.912614/full
protein subcellular localization; class imbalance learning; multi-label classification; generative adversarial networks; deep learning
spellingShingle Liwen Wu
Song Gao
Shaowen Yao
Feng Wu
Jie Li
Yunyun Dong
Yunqi Zhang
Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM
Frontiers in Genetics
protein subcellular localization
class imbalance learning
multi-label classification
generative adversarial networks
deep learning
title Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM
title_sort gm ploc a subcellular localization model of multi label protein based on gan and deepfm
topic protein subcellular localization
class imbalance learning
multi-label classification
generative adversarial networks
deep learning
url https://www.frontiersin.org/articles/10.3389/fgene.2022.912614/full
work_keys_str_mv AT liwenwu gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm
AT songgao gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm
AT shaowenyao gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm
AT fengwu gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm
AT jieli gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm
AT yunyundong gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm
AT yunqizhang gmplocasubcellularlocalizationmodelofmultilabelproteinbasedongananddeepfm