Thorough statistical analyses of breast cancer co-methylation patterns

Abstract. Background: Breast cancer is one of the most commonly diagnosed cancers. It is associated with DNA methylation, an epigenetic event with a methyl group added to a cytosine paired with a guanine, i.e., a CG site. The methylation levels of different genes in a genome are correlated in certain...

Bibliographic Details
Main Authors: Shuying Sun, Jael Dammann, Pierce Lai, Christine Tian
Format: Article
Language: English
Published: BMC 2022-04-01
Series: BMC Genomic Data
Subjects: Breast cancer; Co-methylation; Correlation analysis
Online Access: https://doi.org/10.1186/s12863-022-01046-w
_version_ 1811332506741374976
author Shuying Sun
Jael Dammann
Pierce Lai
Christine Tian
collection DOAJ
description Abstract
Background: Breast cancer is one of the most commonly diagnosed cancers. It is associated with DNA methylation, an epigenetic event in which a methyl group is added to a cytosine paired with a guanine, i.e., a CG site. The methylation levels of different genes in a genome are correlated in certain ways that affect gene functions. This correlation pattern is known as co-methylation. It is still not clear how different genes co-methylate in the whole genome of breast cancer samples. Previous studies were conducted using relatively small datasets (Illumina 27K data). In this study, we analyze much larger datasets (Illumina 450K data).
Results: Our key findings are summarized below. First, normal samples have more highly correlated, or co-methylated, CG pairs than tumor samples. Both tumor and normal samples have more than 93% positive co-methylation, but normal samples have significantly more negatively correlated CG sites than tumor samples (6.6% vs. 2.8%). Second, both tumor and normal samples have about 94% of co-methylated CG pairs on different chromosomes, but normal samples have 470 million more CG pairs. Highly co-methylated pairs on the same chromosome tend to be close to each other. Third, the co-methylation patterns of a small proportion of CG sites change dramatically from normal to tumor. The percentage of differentially methylated (DM) sites among them is larger than the overall DM rate. Fourth, certain CG sites are highly correlated with many CG sites. The top 100 such super-connector CG sites in tumor samples and the top 100 in normal samples do not overlap. Fifth, the locations of both highly changing sites and super-connector sites differ significantly from the locations of genome-wide CG sites. Sixth, chromosome X co-methylation patterns are very different from those of other chromosomes. Finally, the network analyses of genes associated with several sets of co-methylated CG sites identified above show that tumor and normal samples have different patterns.
Conclusions: Our findings will provide researchers with a new understanding of co-methylation patterns in breast cancer. Our ability to thoroughly analyze co-methylation of large datasets will allow researchers to study relationships and associations between different genes in breast cancer.
first_indexed 2024-04-13T16:37:57Z
format Article
id doaj.art-520aab9c4fe04b45b13ab36fab81a930
institution Directory Open Access Journal
issn 2730-6844
language English
last_indexed 2024-04-13T16:37:57Z
publishDate 2022-04-01
publisher BMC
record_format Article
series BMC Genomic Data
spelling doaj.art-520aab9c4fe04b45b13ab36fab81a930
2022-12-22T02:39:22Z
eng
BMC
BMC Genomic Data
2730-6844
2022-04-01
Volume 23, issue 1, pages 1-23
10.1186/s12863-022-01046-w
Thorough statistical analyses of breast cancer co-methylation patterns
Shuying Sun (Department of Mathematics, Texas State University)
Jael Dammann (St. Stephen’s Episcopal School)
Pierce Lai (Massachusetts Institute of Technology)
Christine Tian (Liberal Arts and Science Academy)
https://doi.org/10.1186/s12863-022-01046-w
Breast cancer
Co-methylation
Correlation analysis
title Thorough statistical analyses of breast cancer co-methylation patterns
topic Breast cancer
Co-methylation
Correlation analysis
url https://doi.org/10.1186/s12863-022-01046-w