Logistic Regression Analysis for LncRNA-Disease Association Prediction Based on Random Forest and Clinical Stage Data

A growing number of studies have found that lncRNA plays an important role in various life processes of the body. Current prediction research on lncRNA-disease associations overlooks correlation analysis of disease prognosis. In this study, a logistic regression prediction model based on...

Bibliographic Details
Main Authors: Bo Wang, Jing Zhang
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: LncRNA-disease association; random forest; logistic regression analysis; clinical stage data
Online Access: https://ieeexplore.ieee.org/document/9000875/
_version_ 1819159026790825984
author Bo Wang
Jing Zhang
author_facet Bo Wang
Jing Zhang
author_sort Bo Wang
collection DOAJ
description A growing number of studies have found that lncRNA plays an important role in various life processes of the body. Current prediction research on lncRNA-disease associations overlooks correlation analysis of disease prognosis. In this study, a logistic regression prediction model is constructed from tumor clinical stage data and the expression levels of lncRNA transcripts. The proposed model is based on unknown human lncRNA-disease associations combined with the clinical stage data. First, the importance of each characteristic variable is calculated by the proposed CVSgC-RF algorithm. Second, the 95 lncRNAs most closely related to prostate cancer are selected from 480 candidate lncRNAs by CASO and CVSe-CS-CF. From these 95 lncRNAs, the CSPA-PL algorithm then selects the 22 lncRNAs most closely related to the tumor clinical stage for prostate cancer. Finally, these 22 lncRNAs are used to construct a logistic regression prediction model. The method is also applied to lung cancer data, where 16 lncRNAs are selected to construct a logistic regression prediction model for lung cancer. Experimental results show that the proposed method achieves the best ROC area, accuracy, and recall for both prostate cancer and lung cancer, which provides a promising basis for subsequent prediction studies of lncRNA-disease associations.
first_indexed 2024-12-22T16:34:01Z
format Article
id doaj.art-16d16a85cfb94cb6aa47e55fdb093b30
institution Directory of Open Access Journals
issn 2169-3536
language English
last_indexed 2024-12-22T16:34:01Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-16d16a85cfb94cb6aa47e55fdb093b30; 2022-12-21T18:20:00Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 35004; 35017; 10.1109/ACCESS.2020.2974624; 9000875
Bo Wang, https://orcid.org/0000-0002-4983-7288, College of Computer Science and Technology, Harbin Engineering University, Harbin, China
Jing Zhang, College of Computer Science and Technology, Harbin Engineering University, Harbin, China
title Logistic Regression Analysis for LncRNA-Disease Association Prediction Based on Random Forest and Clinical Stage Data
title_sort logistic regression analysis for lncrna disease association prediction based on random forest and clinical stage data
topic LncRNA-disease association
random forest
logistic regression analysis
clinical stage data
url https://ieeexplore.ieee.org/document/9000875/
work_keys_str_mv AT bowang logisticregressionanalysisforlncrnadiseaseassociationpredictionbasedonrandomforestandclinicalstagedata
AT jingzhang logisticregressionanalysisforlncrnadiseaseassociationpredictionbasedonrandomforestandclinicalstagedata
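
The description field above outlines a two-stage workflow: rank candidate lncRNAs by importance, keep the top-ranked ones, and fit a logistic regression model on them. The following is a minimal sketch of that general shape only, not the authors' method. It assumes synthetic data in place of real expression and clinical stage data, and it uses a simple absolute-correlation score as a stand-in for the random-forest importance that CVSgC-RF computes. All function names and parameters (make_synthetic, select_top_k, fit_logistic, and so on) are hypothetical.

```python
import math
import random


def make_synthetic(n_samples=200, n_features=12, n_informative=3, seed=0):
    """Synthetic 'expression' rows; only the first n_informative columns drive the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        signal = sum(row[:n_informative]) + rng.gauss(0.0, 0.5)
        X.append(row)
        y.append(1 if signal > 0 else 0)
    return X, y


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one feature column and the 0/1 label."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def select_top_k(X, y, k):
    """Assumed stand-in for importance ranking: indices of the k highest-scoring columns."""
    scores = [abs_correlation([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    """Classify each row as 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) >= 0.5 else 0
            for row in X]


def accuracy(y_true, y_pred):
    return sum(int(a == b) for a, b in zip(y_true, y_pred)) / len(y_true)


X, y = make_synthetic()
split = 150  # first 150 rows for training, the remaining 50 held out
selected = select_top_k(X[:split], y[:split], k=3)      # rank on training rows only
X_sel = [[row[j] for j in selected] for row in X]        # keep the selected columns
w, b = fit_logistic(X_sel[:split], y[:split])
acc = accuracy(y[split:], predict(w, b, X_sel[split:]))
print("selected columns:", sorted(selected), "held-out accuracy:", round(acc, 3))
```

Ranking on the training rows only keeps the held-out accuracy from being inflated by feature selection that has already seen the test labels. A real analysis would replace the correlation score with the importance measure described in the article and report ROC area and recall alongside accuracy.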