Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM
Identifying the subcellular localization of a given protein is an essential part of biological and medical research, since the protein must be localized in the correct organelle to ensure physiological function. Conventional biological experiments for protein subcellular localization have some limit...
Main Authors: | Liwen Wu, Song Gao, Shaowen Yao, Feng Wu, Jie Li, Yunyun Dong, Yunqi Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-06-01 |
Series: | Frontiers in Genetics |
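Subjects: | protein subcellular localization; class imbalance learning; multi-label classification; generative adversarial networks; deep learning |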
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2022.912614/full |
author | Liwen Wu, Song Gao, Shaowen Yao, Feng Wu, Jie Li, Yunyun Dong, Yunqi Zhang |
---|---|
collection | DOAJ |
description | Identifying the subcellular localization of a protein is an essential part of biological and medical research, since a protein must be localized in the correct organelle to perform its physiological function. Conventional biological experiments for determining protein subcellular localization have limitations such as high cost and low efficiency, so many computational methods have been proposed to address them. However, some of these methods still need improvement when the data suffer from class imbalance. We propose a new model, generating minority samples for protein subcellular localization (Gm-PLoc), to predict the subcellular localization of multi-label proteins. The model has three steps: extracting distinguishable protein features with the position-specific scoring matrix; synthesizing minority-category samples with a revised generative adversarial network to balance the category distribution; and training a classifier on the rebalanced dataset to predict the subcellular localization of multi-label proteins. The model is evaluated on one benchmark dataset, and the experimental results show that Gm-PLoc performs well for multi-label protein subcellular localization. |
first_indexed | 2024-04-12T14:03:49Z |
format | Article |
id | doaj.art-f254ce0897a24bdd9fbf600c04c92efd |
institution | Directory Open Access Journal |
issn | 1664-8021 |
language | English |
last_indexed | 2024-04-12T14:03:49Z |
publishDate | 2022-06-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-f254ce0897a24bdd9fbf600c04c92efd; 2022-12-22T03:30:08Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2022-06-01; 13; 10.3389/fgene.2022.912614; 912614; Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM; Liwen Wu, Song Gao, Shaowen Yao, Feng Wu, Jie Li, Yunyun Dong (each: Engineering Research Center of Cyberspace, Yunnan University, Kunming, China; School of Software, Yunnan University, Kunming, China); Yunqi Zhang (Engineering Research Center of Cyberspace, Yunnan University, Kunming, China; School of Software, Yunnan University, Kunming, China; Yunnan Key Laboratory of Statistical Modeling and Data Analysis, School of Mathematics and Statistics, Yunnan University, Kunming, China); https://www.frontiersin.org/articles/10.3389/fgene.2022.912614/full; protein subcellular localization; class imbalance learning; multi-label classification; generative adversarial networks; deep learning |
title | Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM |
topic | protein subcellular localization; class imbalance learning; multi-label classification; generative adversarial networks; deep learning |
url | https://www.frontiersin.org/articles/10.3389/fgene.2022.912614/full |
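
The description's second step balances the category distribution by synthesizing minority-category samples. As a rough illustration of what "balance" could mean for multi-label data, the sketch below counts how many proteins carry each location label and how many synthetic samples each label would need to reach the most frequent one. The function names, the toy annotations, and the "match the largest class" target are assumptions made for illustration; the record does not state how Gm-PLoc sets its targets, and no GAN is implemented here.

```python
from collections import Counter


def label_counts(label_sets):
    """Count how many proteins carry each location label.

    A multi-label protein contributes once to every label it carries.
    """
    counts = Counter()
    for labels in label_sets:
        counts.update(set(labels))
    return counts


def synthesis_plan(label_sets):
    """Synthetic samples each label would need to match the most frequent label.

    Assumption: the balancing target is the size of the largest class.
    """
    counts = label_counts(label_sets)
    if not counts:
        return {}
    target = max(counts.values())
    return {label: target - n for label, n in counts.items() if n < target}


# Hypothetical toy annotations, not taken from the benchmark dataset.
toy = [
    {"nucleus"},
    {"nucleus", "cytoplasm"},
    {"nucleus"},
    {"cytoplasm"},
    {"mitochondrion"},
]

if __name__ == "__main__":
    print(label_counts(toy))
    print(synthesis_plan(toy))
```

In this toy set, "nucleus" is the majority label, so the plan asks for one extra "cytoplasm" sample and two extra "mitochondrion" samples. A generator, such as the revised generative adversarial network the description mentions, would then be responsible for producing those samples.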