Predicting liver cancer on epigenomics data using machine learning

Epigenomics is the branch of biology concerned with the phenotype modifications that do not induce any change in the cell DNA sequence. Epigenetic modifications apply changes to the properties of DNA, which ultimately prevents such DNA actions from being executed. These alterations arise in the canc...

Bibliographic Details
Main Authors:	Vishalkumar Vekariya, Kalpdrum Passi, Chakresh Kumar Jain
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2022-09-01
Series:	Frontiers in Bioinformatics
Subjects:	epigenomics; histone; DNA methylation; human genome; RNA
Online Access:	https://www.frontiersin.org/articles/10.3389/fbinf.2022.954529/full

author	Vishalkumar Vekariya, Kalpdrum Passi, Chakresh Kumar Jain
collection	DOAJ
description	Epigenomics is the branch of biology concerned with the phenotype modifications that do not induce any change in the cell DNA sequence. Epigenetic modifications apply changes to the properties of DNA, which ultimately prevents such DNA actions from being executed. These alterations arise in the cancer cells, which is the only cause of cancer. The liver is the metabolic cleansing center of the human body and the only organ, which can regenerate itself, but liver cancer can stop the cleansing of the body. Machine learning techniques are used in this research to predict the gene expression of the liver cells for the liver hepatocellular carcinoma (LIHC), which is the third biggest reason of death by cancer and affects five hundred thousand people per year. The data for LIHC include four different types, namely, methylation, histone, the human genome, and RNA sequences. The data were accessed through open-source technologies in R programming languages for The Cancer Genome Atlas (TCGA). The proposed method considers 1,000 features across the four types of data. Nine different feature selection methods were used and eight different classification methods were compared to select the best model over 5-fold cross-validation and different training-to-test ratios. The best model was obtained for 140 features for ReliefF feature selection and XGBoost classification method with an AUC of 1.0 and an accuracy of 99.67% to predict the liver cancer.
first_indexed	2024-12-10T06:12:32Z
format	Article
id	doaj.art-7d04d3c1667f4262bf1a1c892d85975d
institution	Directory Open Access Journal
issn	2673-7647
language	English
last_indexed	2024-12-10T06:12:32Z
publishDate	2022-09-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Bioinformatics
spelling	doaj.art-7d04d3c1667f4262bf1a1c892d85975d; 2022-12-22T01:59:32Z; eng; Frontiers Media S.A.; Frontiers in Bioinformatics; 2673-7647; 2022-09-01; 2; 10.3389/fbinf.2022.954529; 954529; Predicting liver cancer on epigenomics data using machine learning; Vishalkumar Vekariya [0]; Kalpdrum Passi [1]; Chakresh Kumar Jain [2]; [0] School of Engineering and Computer Science, Laurentian University, Sudbury, ON, Canada; [1] School of Engineering and Computer Science, Laurentian University, Sudbury, ON, Canada; [2] Department of Biotechnology, Jaypee Institute of Information Technology, Noida, India; https://www.frontiersin.org/articles/10.3389/fbinf.2022.954529/full; epigenomics; histone; DNA methylation; human genome; RNA
title	Predicting liver cancer on epigenomics data using machine learning
topic	epigenomics; histone; DNA methylation; human genome; RNA
url	https://www.frontiersin.org/articles/10.3389/fbinf.2022.954529/full

Similar Items