Pan-cancer analysis of differential DNA methylation patterns

Abstract Background DNA methylation is a key epigenetic regulator contributing to cancer development. To understand the role of DNA methylation in tumorigenesis, it is important to investigate and compare differential methylation (DM) patterns between normal and case samples across different cancer...


Bibliographic Details
Main Authors: Mai Shi, Stephen Kwok-Wing Tsui, Hao Wu, Yingying Wei
Format: Article
Language: English
Published: BMC 2020-10-01
Series: BMC Medical Genomics
Subjects: DNA methylation; Differential methylation; Pan-cancer; Cancer epigenomics
Online Access: http://link.springer.com/article/10.1186/s12920-020-00780-3
_version_ 1818963880189100032
author Mai Shi
Stephen Kwok-Wing Tsui
Hao Wu
Yingying Wei
author_facet Mai Shi
Stephen Kwok-Wing Tsui
Hao Wu
Yingying Wei
author_sort Mai Shi
collection DOAJ
description Abstract
Background DNA methylation is a key epigenetic regulator contributing to cancer development. To understand the role of DNA methylation in tumorigenesis, it is important to investigate and compare differential methylation (DM) patterns between normal and case samples across different cancer types. However, current pan-cancer analyses call DM separately for each cancer, an approach that has lower statistical power and fails to provide a comprehensive view of patterns across cancers.
Methods In this work, we propose a rigorous statistical model, PanDM, to jointly characterize DM patterns across diverse cancer types. PanDM uses the hidden correlations in the combined dataset to improve statistical power through joint modeling. PanDM takes summary statistics from separate analyses as input and performs methylation site clustering, differential methylation detection, and pan-cancer pattern discovery. We demonstrate the favorable performance of PanDM using simulation data. We apply our model to 12 cancer methylome datasets collected from The Cancer Genome Atlas (TCGA) project. We further conduct ontology- and pathway-enrichment analyses to gain new biological insights into the pan-cancer DM patterns learned by PanDM.
Results PanDM outperforms two types of separate analyses in the power of DM calling in the simulation study. Application of PanDM to TCGA data reveals 37 pan-cancer DM patterns in the 12 cancer methylomes, including both common and cancer-type-specific patterns. These 37 patterns are in turn used to group cancer types. Functional ontology and biological pathways enriched in the non-common patterns not only underpin the cancer-type-specific etiology and pathogenesis but also unveil the common environmental risk factors shared by multiple cancer types. We also identify PanDM-specific DM CpG sites that the common strategy fails to detect.
Conclusions PanDM is a powerful tool that provides a systematic way to investigate aberrant methylation patterns across multiple cancer types. Results from real data analyses suggest a novel angle for understanding the common and specific DM patterns in different cancers. Moreover, as PanDM works on the summary statistics for each cancer type, the same framework can in principle be applied to pan-cancer analyses of other functional genomic profiles. We implement PanDM as an R package, which is freely available at http://www.sta.cuhk.edu.hk/YWei/PanDM.html.
first_indexed 2024-12-20T12:52:14Z
format Article
id doaj.art-3f80b9d8f52240e2bc6b9aa1f358bd85
institution Directory Open Access Journal
issn 1755-8794
language English
last_indexed 2024-12-20T12:52:14Z
publishDate 2020-10-01
publisher BMC
record_format Article
series BMC Medical Genomics
spelling doaj.art-3f80b9d8f52240e2bc6b9aa1f358bd85; 2022-12-21T19:40:09Z; eng; BMC; BMC Medical Genomics; 1755-8794; 2020-10-01; 13; S10; 1; 13; 10.1186/s12920-020-00780-3
Pan-cancer analysis of differential DNA methylation patterns
Mai Shi (School of Biomedical Sciences, The Chinese University of Hong Kong)
Stephen Kwok-Wing Tsui (School of Biomedical Sciences, The Chinese University of Hong Kong)
Hao Wu (Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University)
Yingying Wei (Department of Statistics, The Chinese University of Hong Kong)
http://link.springer.com/article/10.1186/s12920-020-00780-3
DNA methylation
Differential methylation
Pan-cancer
Cancer epigenomics
spellingShingle Mai Shi
Stephen Kwok-Wing Tsui
Hao Wu
Yingying Wei
Pan-cancer analysis of differential DNA methylation patterns
BMC Medical Genomics
DNA methylation
Differential methylation
Pan-cancer
Cancer epigenomics
title Pan-cancer analysis of differential DNA methylation patterns
title_full Pan-cancer analysis of differential DNA methylation patterns
title_fullStr Pan-cancer analysis of differential DNA methylation patterns
title_full_unstemmed Pan-cancer analysis of differential DNA methylation patterns
title_short Pan-cancer analysis of differential DNA methylation patterns
title_sort pan cancer analysis of differential dna methylation patterns
topic DNA methylation
Differential methylation
Pan-cancer
Cancer epigenomics
url http://link.springer.com/article/10.1186/s12920-020-00780-3
work_keys_str_mv AT maishi pancanceranalysisofdifferentialdnamethylationpatterns
AT stephenkwokwingtsui pancanceranalysisofdifferentialdnamethylationpatterns
AT haowu pancanceranalysisofdifferentialdnamethylationpatterns
AT yingyingwei pancanceranalysisofdifferentialdnamethylationpatterns