BayesAge: A maximum likelihood algorithm to predict epigenetic age

Introduction: DNA methylation, specifically the formation of 5-methylcytosine at the C5 position of cytosine, undergoes reproducible changes as organisms age, establishing it as a significant biomarker in aging studies. Epigenetic clocks, which integrate methylation patterns to predict age, often em...

Bibliographic Details
Main Authors: Lajoyce Mboning, Liudmilla Rubbi, Michael Thompson, Louis-S. Bouchard, Matteo Pellegrini
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-04-01
Series: Frontiers in Bioinformatics
Subjects: BayesAge; scAge; epigenetic age; maximum likelihood estimation; true age
Online Access: https://www.frontiersin.org/articles/10.3389/fbinf.2024.1329144/full
_version_ 1797224425524822016
author Lajoyce Mboning
Liudmilla Rubbi
Michael Thompson
Louis-S. Bouchard
Matteo Pellegrini
author_facet Lajoyce Mboning
Liudmilla Rubbi
Michael Thompson
Louis-S. Bouchard
Matteo Pellegrini
author_sort Lajoyce Mboning
collection DOAJ
description Introduction: DNA methylation, specifically the formation of 5-methylcytosine at the C5 position of cytosine, undergoes reproducible changes as organisms age, establishing it as a significant biomarker in aging studies. Epigenetic clocks, which integrate methylation patterns to predict age, often employ linear models based on penalized regression, yet they encounter challenges in handling missing data, count-based bisulfite sequence data, and interpretation. Methods: To address these limitations, we introduce BayesAge, an extension of the scAge methodology originally designed for single-cell DNA methylation analysis. BayesAge employs maximum likelihood estimation (MLE) for age inference, models count data using binomial distributions, and incorporates LOWESS smoothing to capture non-linear methylation-age dynamics. This approach is tailored for bulk bisulfite sequencing datasets. Results: BayesAge demonstrates superior performance compared to scAge. Notably, its age residuals exhibit no age association, offering a less biased representation of epigenetic age variation across populations. Furthermore, BayesAge facilitates the estimation of error bounds on age inference. When applied to down-sampled data, BayesAge achieves a higher coefficient of determination between predicted and actual ages compared to both scAge and penalized regression. Discussion: BayesAge presents a promising advancement in epigenetic age prediction, addressing key challenges encountered by existing models. By integrating robust statistical techniques and tailored methodologies for count-based data, BayesAge offers improved accuracy and interpretability in predicting age from bulk bisulfite sequencing datasets. Its ability to estimate error bounds enhances the reliability of age inference, thereby contributing to a more comprehensive understanding of epigenetic aging processes.
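The inference step named in the description (binomial modelling of read counts, with age chosen by maximum likelihood) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `binom_loglik` and `predict_age`, the site identifiers, the read counts, and the reference curves are all hypothetical, and simple closed-form curves stand in for the LOWESS fits that the record says BayesAge uses. Sites missing from either the sample or the reference are simply skipped, which is one plausible way a likelihood-based method can tolerate missing data.

```python
import math


def binom_loglik(k, n, p):
    """Log-probability of k methylated reads out of n at methylation level p."""
    p = min(max(p, 1e-6), 1 - 1e-6)  # clamp so both logarithms stay finite
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def predict_age(sites, reference, ages):
    """Grid search for the candidate age with the highest summed log-likelihood.

    sites:     {site_id: (methylated_reads, total_reads)} for one sample
    reference: {site_id: function mapping age -> expected methylation level}
    ages:      iterable of candidate ages to evaluate
    """
    best_age, best_ll = None, -math.inf
    for age in ages:
        ll = 0.0
        for site_id, (k, n) in sites.items():
            if site_id not in reference:
                continue  # no reference curve for this site, so it adds nothing
            ll += binom_loglik(k, n, reference[site_id](age))
        if ll > best_ll:
            best_age, best_ll = age, ll
    return best_age, best_ll


# Hypothetical reference curves (assumption: stand-ins for smoothed fits).
reference = {
    "cpg_a": lambda age: 0.20 + 0.006 * age,                      # gains methylation
    "cpg_b": lambda age: 0.90 - 0.005 * age,                      # loses methylation
    "cpg_c": lambda age: 0.10 + 0.8 * (1 - math.exp(-age / 30)),  # non-linear
}

# Hypothetical sample: cpg_c was not covered, cpg_d has no reference curve.
sample = {"cpg_a": (25, 50), "cpg_b": (32, 50), "cpg_d": (5, 10)}

age_hat, ll_hat = predict_age(sample, reference, range(0, 101))
print(age_hat, ll_hat)
```

Because every candidate age receives a log-likelihood, the same profile could in principle be used to derive an interval around the estimate, which is in the spirit of the error bounds mentioned in the description.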
first_indexed 2024-04-24T13:52:55Z
format Article
id doaj.art-c665913c2e3a4f788a07405cdbdc8d73
institution Directory Open Access Journal
issn 2673-7647
language English
last_indexed 2024-04-24T13:52:55Z
publishDate 2024-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Bioinformatics
spelling doaj.art-c665913c2e3a4f788a07405cdbdc8d73
2024-04-04T04:24:01Z
eng
Frontiers Media S.A.
Frontiers in Bioinformatics
2673-7647
2024-04-01
4
10.3389/fbinf.2024.1329144
1329144
BayesAge: A maximum likelihood algorithm to predict epigenetic age
Lajoyce Mboning [0]
Liudmilla Rubbi [1]
Michael Thompson [2]
Louis-S. Bouchard [3]
Matteo Pellegrini [4]
[0] Department of Chemistry and Biochemistry, University of California Los Angeles, Los Angeles, CA, United States
[1] Department of Molecular, Cell and Developmental Biology, University of Los Angeles, Los Angeles, CA, United States
[2] Department of Molecular, Cell and Developmental Biology, University of Los Angeles, Los Angeles, CA, United States
[3] Department of Chemistry and Biochemistry, University of California Los Angeles, Los Angeles, CA, United States
[4] Department of Molecular, Cell and Developmental Biology, University of Los Angeles, Los Angeles, CA, United States
Introduction: DNA methylation, specifically the formation of 5-methylcytosine at the C5 position of cytosine, undergoes reproducible changes as organisms age, establishing it as a significant biomarker in aging studies. Epigenetic clocks, which integrate methylation patterns to predict age, often employ linear models based on penalized regression, yet they encounter challenges in handling missing data, count-based bisulfite sequence data, and interpretation. Methods: To address these limitations, we introduce BayesAge, an extension of the scAge methodology originally designed for single-cell DNA methylation analysis. BayesAge employs maximum likelihood estimation (MLE) for age inference, models count data using binomial distributions, and incorporates LOWESS smoothing to capture non-linear methylation-age dynamics. This approach is tailored for bulk bisulfite sequencing datasets. Results: BayesAge demonstrates superior performance compared to scAge. Notably, its age residuals exhibit no age association, offering a less biased representation of epigenetic age variation across populations. Furthermore, BayesAge facilitates the estimation of error bounds on age inference. When applied to down-sampled data, BayesAge achieves a higher coefficient of determination between predicted and actual ages compared to both scAge and penalized regression. Discussion: BayesAge presents a promising advancement in epigenetic age prediction, addressing key challenges encountered by existing models. By integrating robust statistical techniques and tailored methodologies for count-based data, BayesAge offers improved accuracy and interpretability in predicting age from bulk bisulfite sequencing datasets. Its ability to estimate error bounds enhances the reliability of age inference, thereby contributing to a more comprehensive understanding of epigenetic aging processes.
https://www.frontiersin.org/articles/10.3389/fbinf.2024.1329144/full
BayesAge
scAge
epigenetic age
maximum likelihood estimation
true age
spellingShingle Lajoyce Mboning
Liudmilla Rubbi
Michael Thompson
Louis-S. Bouchard
Matteo Pellegrini
BayesAge: A maximum likelihood algorithm to predict epigenetic age
Frontiers in Bioinformatics
BayesAge
scAge
epigenetic age
maximum likelihood estimation
true age
title BayesAge: A maximum likelihood algorithm to predict epigenetic age
title_full BayesAge: A maximum likelihood algorithm to predict epigenetic age
title_fullStr BayesAge: A maximum likelihood algorithm to predict epigenetic age
title_full_unstemmed BayesAge: A maximum likelihood algorithm to predict epigenetic age
title_short BayesAge: A maximum likelihood algorithm to predict epigenetic age
title_sort bayesage a maximum likelihood algorithm to predict epigenetic age
topic BayesAge
scAge
epigenetic age
maximum likelihood estimation
true age
url https://www.frontiersin.org/articles/10.3389/fbinf.2024.1329144/full
work_keys_str_mv AT lajoycemboning bayesageamaximumlikelihoodalgorithmtopredictepigeneticage
AT liudmillarubbi bayesageamaximumlikelihoodalgorithmtopredictepigeneticage
AT michaelthompson bayesageamaximumlikelihoodalgorithmtopredictepigeneticage
AT louissbouchard bayesageamaximumlikelihoodalgorithmtopredictepigeneticage
AT matteopellegrini bayesageamaximumlikelihoodalgorithmtopredictepigeneticage