PretiMeth: precise prediction models for DNA methylation based on single methylation mark

Abstract Background The computational prediction of methylation levels at single-CpG resolution is a promising way to estimate the methylation levels of CpGs not covered by existing array techniques, especially for the large body of existing 450 K beadchip array data. General prediction models concentrate on imp...

Bibliographic Details
Main Authors: Jianxiong Tang, Jianxiao Zou, Xiaoran Zhang, Mei Fan, Qi Tian, Shuyao Fu, Shihong Gao, Shicai Fan
Format: Article
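Language: English
Published: BMC 2020-05-01
Series: BMC Genomics
Subjects: DNA methylation; Single-locus modeling; Precise prediction; Logistic regression; TCGA; Differential methylation
Online Access: http://link.springer.com/article/10.1186/s12864-020-6768-9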
_version_ 1819082664036007936
author Jianxiong Tang
Jianxiao Zou
Xiaoran Zhang
Mei Fan
Qi Tian
Shuyao Fu
Shihong Gao
Shicai Fan
author_sort Jianxiong Tang
collection DOAJ
description Abstract Background The computational prediction of methylation levels at single-CpG resolution is a promising way to estimate the methylation levels of CpGs not covered by existing array techniques, especially for the large body of existing 450 K beadchip array data. General prediction models concentrate on improving the overall prediction accuracy for the bulk of CpG loci while neglecting whether each locus is precisely predicted. This limits the application of the prediction results, especially in downstream analyses with high precision requirements. Results Here we report PretiMeth, a method for constructing a precise prediction model for each single CpG locus. PretiMeth uses a logistic regression algorithm to build a prediction model for each locus of interest. Each model uses only one DNA methylation feature: the one whose methylation pattern is most similar to that of the CpG locus to be predicted. We found that PretiMeth outperformed other algorithms in prediction accuracy and remained robust across platforms and cell types. Furthermore, PretiMeth was applied to The Cancer Genome Atlas (TCGA) data; intensive analysis based on the precise prediction results showed that several CpG loci and genes (differentially methylated between tumor and normal samples) are worthy of further biological validation. Conclusion The precise prediction of single CpG loci is important for both methylation array data expansion and downstream analysis of prediction results. PretiMeth achieved precise modeling for each CpG locus by using only one significant feature, which also suggests that our precise prediction models could serve as a reference for probe set design when DNA methylation beadchips are updated. PretiMeth is provided as an open source tool via https://github.com/JxTang-bioinformatics/PretiMeth .
first_indexed 2024-12-21T20:20:16Z
format Article
id doaj.art-ffc133583eff484a80a50b6267d6c56d
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-12-21T20:20:16Z
publishDate 2020-05-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-ffc133583eff484a80a50b6267d6c56d
2022-12-21T18:51:30Z
eng
BMC
BMC Genomics
1471-2164
2020-05-01
21
1
1
15
10.1186/s12864-020-6768-9
PretiMeth: precise prediction models for DNA methylation based on single methylation mark
Jianxiong Tang, School of Automation Engineering, University of Electronic Science and Technology of China
Jianxiao Zou, School of Automation Engineering, University of Electronic Science and Technology of China
Xiaoran Zhang, School of Automation Engineering, University of Electronic Science and Technology of China
Mei Fan, Chengdu Women’s and Children’s Central Hospital, School of Medicine, University of Electronic Science and Technology of China
Qi Tian, School of Automation Engineering, University of Electronic Science and Technology of China
Shuyao Fu, School of Automation Engineering, University of Electronic Science and Technology of China
Shihong Gao, School of Automation Engineering, University of Electronic Science and Technology of China
Shicai Fan, School of Automation Engineering, University of Electronic Science and Technology of China
http://link.springer.com/article/10.1186/s12864-020-6768-9
DNA methylation
Single-locus modeling
Precise prediction
Logistic regression
TCGA
Differential methylation
title PretiMeth: precise prediction models for DNA methylation based on single methylation mark
topic DNA methylation
Single-locus modeling
Precise prediction
Logistic regression
TCGA
Differential methylation
url http://link.springer.com/article/10.1186/s12864-020-6768-9