Logistic Regression Analysis for LncRNA-Disease Association Prediction Based on Random Forest and Clinical Stage Data
A growing number of studies have found that lncRNAs play an important role in various life processes of the body. Current prediction research on lncRNA-disease associations overlooks correlation analysis of disease prognosis. In this study, a logistic regression prediction model based on...
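
To make the modeling step in the summary above more concrete, here is a minimal sketch of a logistic regression classifier fitted by batch gradient descent. It is not the authors' implementation: the data are synthetic stand-ins rather than lncRNA expression or clinical stage data, and the function names (`fit_logistic`, `predict_proba`), learning rate, and epoch count are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Return the predicted probability of the positive class per sample."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Synthetic stand-in data (NOT from the paper): three hypothetical
# expression features per sample; only the first carries label signal.
random.seed(0)
X, y = [], []
for _ in range(200):
    label = random.randint(0, 1)
    X.append([random.gauss(2.0 * label - 1.0, 1.0),
              random.gauss(0.0, 1.0),
              random.gauss(0.0, 1.0)])
    y.append(label)

w, b = fit_logistic(X, y)
probs = predict_proba(X, w, b)
accuracy = sum((p >= 0.5) == (t == 1) for p, t in zip(probs, y)) / len(y)
```

In this toy setup the learned weight on the informative feature should dominate the two noise features, which is the same intuition behind selecting a small set of informative lncRNAs before fitting the final model.
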
Main Authors: | Bo Wang, Jing Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | LncRNA-disease association; random forest; logistic regression analysis; clinical stage data |
Online Access: | https://ieeexplore.ieee.org/document/9000875/ |
_version_ | 1819159026790825984 |
---|---|
author | Bo Wang Jing Zhang |
collection | DOAJ |
description | A growing number of studies have found that lncRNAs play an important role in various life processes of the body. Current prediction research on lncRNA-disease associations overlooks correlation analysis of disease prognosis. This study constructs a logistic regression prediction model based on tumor clinical stage data and the expression level of lncRNA transcripts. The proposed model is based on unknown human lncRNA-disease associations combined with clinical stage data. First, the importance of each characteristic variable is calculated by the proposed CVSgC-RF algorithm. Second, the 95 lncRNAs most closely related to prostate cancer are selected from 480 candidate lncRNAs by CASO and CVSe-CS-CF. From these 95 lncRNAs, the CSPA-PL algorithm then selects the 22 lncRNAs most closely related to the tumor clinical stage of prostate cancer. Finally, these 22 lncRNAs are used to construct a logistic regression prediction model. The method is also applied to lung cancer data, where 16 lncRNAs are selected to construct a logistic regression prediction model for lung cancer. Experimental results show that the proposed method achieves the best ROC area, accuracy, and recall rate for both prostate cancer and lung cancer, which provides a promising basis for subsequent prediction studies of lncRNA-disease associations. |
first_indexed | 2024-12-22T16:34:01Z |
format | Article |
id | doaj.art-16d16a85cfb94cb6aa47e55fdb093b30 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T16:34:01Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-16d16a85cfb94cb6aa47e55fdb093b30; 2022-12-21T18:20:00Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; vol. 8, pp. 35004-35017; 10.1109/ACCESS.2020.2974624; 9000875; Bo Wang (https://orcid.org/0000-0002-4983-7288), College of Computer Science and Technology, Harbin Engineering University, Harbin, China; Jing Zhang, College of Computer Science and Technology, Harbin Engineering University, Harbin, China; https://ieeexplore.ieee.org/document/9000875/ |
title | Logistic Regression Analysis for LncRNA-Disease Association Prediction Based on Random Forest and Clinical Stage Data |
topic | LncRNA-disease association; random forest; logistic regression analysis; clinical stage data |
url | https://ieeexplore.ieee.org/document/9000875/ |
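
The description in this record reports results in terms of ROC area, accuracy, and recall rate. As a rough illustration of what those three quantities measure, the sketch below computes them for a tiny hand-made example. The labels, scores, and the 0.5 decision threshold are hypothetical and are not taken from the paper; `roc_auc` uses the pairwise (Mann-Whitney) formulation, which is one standard way to compute the area under the ROC curve.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives that the model recovers."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC area needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)   # 4 of 6 correct
rec = recall(y_true, y_pred)     # 2 of 3 positives found
auc = roc_auc(y_true, scores)    # 8 of 9 positive/negative pairs ordered correctly
```

Accuracy and recall depend on the chosen threshold, whereas the ROC area summarizes ranking quality across all thresholds, which is why the three are usually reported together.
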