Estimating DNA methylation potential energy landscapes from nanopore sequencing data
Abstract High-throughput third-generation nanopore sequencing devices have enormous potential for simultaneously observing epigenetic modifications in human cells over large regions of the genome. However, signals generated by these devices are subject to considerable noise that can lead to unsatisf...
Main Authors: | Jordi Abante, Sandeep Kambhampati, Andrew P. Feinberg, John Goutsias |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2021-11-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-021-00781-x |
_version_ | 1818754153002827776 |
---|---|
author | Jordi Abante, Sandeep Kambhampati, Andrew P. Feinberg, John Goutsias |
author_facet | Jordi Abante, Sandeep Kambhampati, Andrew P. Feinberg, John Goutsias |
author_sort | Jordi Abante |
collection | DOAJ |
description | Abstract High-throughput third-generation nanopore sequencing devices have enormous potential for simultaneously observing epigenetic modifications in human cells over large regions of the genome. However, signals generated by these devices are subject to considerable noise that can lead to unsatisfactory detection performance and hamper downstream analysis. Here we develop a statistical method, CpelNano, for the quantification and analysis of 5mC methylation landscapes using nanopore data. CpelNano takes into account nanopore noise by means of a hidden Markov model (HMM) in which the true but unknown (“hidden”) methylation state is modeled through an Ising probability distribution that is consistent with methylation means and pairwise correlations, whereas nanopore current signals constitute the observed state. It then estimates the associated methylation potential energy function by employing the expectation-maximization (EM) algorithm and performs differential methylation analysis via permutation-based hypothesis testing. Using simulations and analysis of published data obtained from three human cell lines (GM12878, MCF-10A, and MDA-MB-231), we show that CpelNano can faithfully estimate DNA methylation potential energy landscapes, substantially improving current methods and leading to a powerful tool for the modeling and analysis of epigenetic landscapes using nanopore sequencing data. |
first_indexed | 2024-12-18T05:18:43Z |
format | Article |
id | doaj.art-1dd7b0607dc5416e8c94fe3f58dcf75f |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-12-18T05:18:43Z |
publishDate | 2021-11-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-1dd7b0607dc5416e8c94fe3f58dcf75f; 2022-12-21T21:19:43Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2021-11-01; 11; 1; 1; 15; 10.1038/s41598-021-00781-x; Estimating DNA methylation potential energy landscapes from nanopore sequencing data; Jordi Abante (0), Sandeep Kambhampati (1), Andrew P. Feinberg (2), John Goutsias (3); Whitaker Biomedical Engineering Institute, Johns Hopkins University; Department of Biomedical Engineering, Johns Hopkins University; Department of Biomedical Engineering, Johns Hopkins University; Whitaker Biomedical Engineering Institute, Johns Hopkins University; Abstract High-throughput third-generation nanopore sequencing devices have enormous potential for simultaneously observing epigenetic modifications in human cells over large regions of the genome. However, signals generated by these devices are subject to considerable noise that can lead to unsatisfactory detection performance and hamper downstream analysis. Here we develop a statistical method, CpelNano, for the quantification and analysis of 5mC methylation landscapes using nanopore data. CpelNano takes into account nanopore noise by means of a hidden Markov model (HMM) in which the true but unknown (“hidden”) methylation state is modeled through an Ising probability distribution that is consistent with methylation means and pairwise correlations, whereas nanopore current signals constitute the observed state. It then estimates the associated methylation potential energy function by employing the expectation-maximization (EM) algorithm and performs differential methylation analysis via permutation-based hypothesis testing. Using simulations and analysis of published data obtained from three human cell lines (GM12878, MCF-10A, and MDA-MB-231), we show that CpelNano can faithfully estimate DNA methylation potential energy landscapes, substantially improving current methods and leading to a powerful tool for the modeling and analysis of epigenetic landscapes using nanopore sequencing data.; https://doi.org/10.1038/s41598-021-00781-x |
spellingShingle | Jordi Abante; Sandeep Kambhampati; Andrew P. Feinberg; John Goutsias; Estimating DNA methylation potential energy landscapes from nanopore sequencing data; Scientific Reports |
title | Estimating DNA methylation potential energy landscapes from nanopore sequencing data |
title_full | Estimating DNA methylation potential energy landscapes from nanopore sequencing data |
title_fullStr | Estimating DNA methylation potential energy landscapes from nanopore sequencing data |
title_full_unstemmed | Estimating DNA methylation potential energy landscapes from nanopore sequencing data |
title_short | Estimating DNA methylation potential energy landscapes from nanopore sequencing data |
title_sort | estimating dna methylation potential energy landscapes from nanopore sequencing data |
url | https://doi.org/10.1038/s41598-021-00781-x |
work_keys_str_mv | AT jordiabante estimatingdnamethylationpotentialenergylandscapesfromnanoporesequencingdata AT sandeepkambhampati estimatingdnamethylationpotentialenergylandscapesfromnanoporesequencingdata AT andrewpfeinberg estimatingdnamethylationpotentialenergylandscapesfromnanoporesequencingdata AT johngoutsias estimatingdnamethylationpotentialenergylandscapesfromnanoporesequencingdata |
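
Illustrative note: the abstract names two building blocks, an Ising distribution over hidden methylation states and permutation-based hypothesis testing. The sketch below shows generic textbook versions of both and is not the CpelNano implementation. The ±1 state coding, the parameter names `a` (per-site term) and `c` (neighbor coupling), the brute-force enumeration, and the difference-of-means test statistic are all assumptions made for illustration. The nanopore noise model and the EM fitting step are omitted.

```python
import itertools
import math
import random


def ising_energy(x, a, c):
    """Potential energy U(x) = -sum_n a[n]*x[n] - sum_n c[n]*x[n]*x[n+1].

    x holds one state per CpG site, coded +1 (methylated) or -1 (unmethylated).
    a and c are hypothetical per-site and nearest-neighbor parameters.
    """
    field = sum(a_n * x_n for a_n, x_n in zip(a, x))
    coupling = sum(c_n * x[n] * x[n + 1] for n, c_n in enumerate(c))
    return -(field + coupling)


def ising_distribution(a, c):
    """Exact p(x) = exp(-U(x)) / Z, enumerating all 2^N states (small N only)."""
    states = list(itertools.product((-1, 1), repeat=len(a)))
    weights = [math.exp(-ising_energy(x, a, c)) for x in states]
    z = sum(weights)  # partition function
    return {x: w / z for x, w in zip(states, weights)}


def permutation_p_value(group1, group2, n_perm=2000, seed=0):
    """Monte Carlo permutation test on the absolute difference of group means."""
    rng = random.Random(seed)

    def stat(g1, g2):
        return abs(sum(g1) / len(g1) - sum(g2) / len(g2))

    observed = stat(group1, group2)
    pooled = list(group1) + list(group2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the samples
        if stat(pooled[:len(group1)], pooled[len(group1):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

With a positive coupling `c`, neighboring sites are more likely to share a methylation state, which is how a model of this kind can encode pairwise correlations. Enumeration is only feasible for a handful of sites; longer regions need a different way of computing the partition function.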