Large-Scale Gastric Cancer Susceptibility Gene Identification Based on Gradient Boosting Decision Tree

The early clinical symptoms of gastric cancer are not obvious, and metastasis may have occurred at the time of treatment. Poor prognosis is one of the important reasons for the high mortality of gastric cancer. Therefore, the identification of gastric cancer-related genes can be used as relevant mar...

Bibliographic Details
Main Authors: Qing Chen, Ji Zhang, Banghe Bao, Fan Zhang, Jie Zhou
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-01-01
Series: Frontiers in Molecular Biosciences
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fmolb.2021.815243/full
author Qing Chen
Ji Zhang
Banghe Bao
Fan Zhang
Jie Zhou
collection DOAJ
description The early clinical symptoms of gastric cancer are not obvious, and metastasis may have occurred at the time of treatment. Poor prognosis is one of the important reasons for the high mortality of gastric cancer. Therefore, the identification of gastric cancer-related genes can be used as relevant markers for diagnosis and treatment to improve diagnosis precision and guide personalized treatment. To further reveal the pathogenesis of gastric cancer at the gene level, we proposed a method based on Gradient Boosting Decision Tree (GBDT) to identify gastric cancer susceptibility genes through a gene interaction network. Starting from the known genes related to gastric cancer, we collected additional genes that interact with them and constructed a gene interaction network. Random Walk was used to extract the network association of each gene, and GBDT was then used to identify the gastric cancer-related genes. To evaluate the AUC and AUPR of our algorithm, we performed 10-fold cross-validation. GBDT achieved an AUC of 0.89 and an AUPR of 0.81. We compared GBDT with four other methods and found that GBDT performed best.
first_indexed 2024-12-18T05:34:42Z
format Article
id doaj.art-c5089ae9158849218e7026e0936204ad
institution Directory Open Access Journal
issn 2296-889X
language English
last_indexed 2024-12-18T05:34:42Z
publishDate 2022-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Molecular Biosciences
spelling doaj.art-c5089ae9158849218e7026e0936204ad
record timestamp: 2022-12-21T21:19:21Z
volume: 8
DOI: 10.3389/fmolb.2021.815243
article number: 815243
Qing Chen (Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China)
Ji Zhang (Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China)
Banghe Bao (Department of Pathology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China)
Fan Zhang (Wuhan Asia General Hospital, Wuhan, China)
Jie Zhou (Department of Biochemistry and Molecular Biology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China)
title Large-Scale Gastric Cancer Susceptibility Gene Identification Based on Gradient Boosting Decision Tree
topic gastric cancer
susceptibility gene
gradient boosting decision tree (GBDT)
random walk (RW)
gastric cancer-related genes
url https://www.frontiersin.org/articles/10.3389/fmolb.2021.815243/full
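
Illustrative note: the description above says only that Random Walk was used to extract the network association of each gene before GBDT classification. The record does not state which random-walk variant, parameters, or network data the authors used, so the following is a minimal sketch under assumptions, not the authors' implementation. It assumes a random walk with restart on an undirected, unweighted network; the gene labels G1 to G5, the edge list, the helper names, and the restart probability of 0.5 are all hypothetical. The resulting scores are the kind of per-gene network feature that could then be passed to a gradient boosting classifier, a step not shown here.

```python
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    adjacency: Dict[str, Set[str]] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def random_walk_with_restart(
    adjacency: Dict[str, Set[str]],
    seeds: Iterable[str],
    restart: float = 0.5,   # assumed value, not taken from the record
    tol: float = 1e-12,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Return the stationary visiting probability of every node.

    At each step the walker jumps back to a seed gene with probability
    `restart`, otherwise it moves to a uniformly chosen neighbour.
    """
    seed_set = set(seeds) & set(adjacency)
    if not seed_set:
        raise ValueError("at least one seed must be a node of the network")
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its remaining mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network; G1 plays the role of a known disease gene.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]
toy_network = build_adjacency(toy_edges)
scores = random_walk_with_restart(toy_network, seeds={"G1"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(f"{gene}\t{scores[gene]:.4f}")
```

Genes closer to the seed in the network receive higher scores, which is the intuition behind using such scores as features for ranking candidate disease genes.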