EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations

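Abstract Deep neural networks (DNNs) hold promise for functional genomics prediction, but their generalization capability may be limited by the amount of available data. To address this, we propose EvoAug, a suite of evolution-inspired augmentations that enhance the training of genomic DNNs by increasing genetic variation. Random transformation of DNA sequences can potentially alter their function in unknown ways, so we employ a fine-tuning procedure using the original non-transformed data to preserve functional integrity. Our results demonstrate that EvoAug substantially improves the generalization and interpretability of established DNNs across prominent regulatory genomics prediction tasks, offering a robust solution for genomic DNNs.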

Bibliographic Details
Main Authors: Nicholas Keone Lee, Ziqi Tang, Shushan Toneyan, Peter K. Koo
Format: Article
Language: English
Published: BMC 2023-05-01
Series: Genome Biology
Subjects: Deep learning; Regulatory genomics; Data augmentations; Model interpretability
Online Access: https://doi.org/10.1186/s13059-023-02941-w
author Nicholas Keone Lee
Ziqi Tang
Shushan Toneyan
Peter K. Koo
collection DOAJ
description Abstract Deep neural networks (DNNs) hold promise for functional genomics prediction, but their generalization capability may be limited by the amount of available data. To address this, we propose EvoAug, a suite of evolution-inspired augmentations that enhance the training of genomic DNNs by increasing genetic variation. Random transformation of DNA sequences can potentially alter their function in unknown ways, so we employ a fine-tuning procedure using the original non-transformed data to preserve functional integrity. Our results demonstrate that EvoAug substantially improves the generalization and interpretability of established DNNs across prominent regulatory genomics prediction tasks, offering a robust solution for genomic DNNs.
first_indexed 2024-04-09T14:02:50Z
format Article
id doaj.art-fdf641cadc6b46828d142ef24c8d0c0d
institution Directory of Open Access Journals
issn 1474-760X
language English
last_indexed 2024-04-09T14:02:50Z
publishDate 2023-05-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-fdf641cadc6b46828d142ef24c8d0c0d | 2023-05-07T11:14:46Z | eng | BMC | Genome Biology | 1474-760X | 2023-05-01 | 24 | 1 | 1 | 14 | 10.1186/s13059-023-02941-w | EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations | Nicholas Keone Lee (0), Ziqi Tang (1), Shushan Toneyan (2), Peter K. Koo (3) | (0) Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory; (1) Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory; (2) Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory; (3) Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory | Abstract Deep neural networks (DNNs) hold promise for functional genomics prediction, but their generalization capability may be limited by the amount of available data. To address this, we propose EvoAug, a suite of evolution-inspired augmentations that enhance the training of genomic DNNs by increasing genetic variation. Random transformation of DNA sequences can potentially alter their function in unknown ways, so we employ a fine-tuning procedure using the original non-transformed data to preserve functional integrity. Our results demonstrate that EvoAug substantially improves the generalization and interpretability of established DNNs across prominent regulatory genomics prediction tasks, offering a robust solution for genomic DNNs. | https://doi.org/10.1186/s13059-023-02941-w | Deep learning | Regulatory genomics | Data augmentations | Model interpretability
title EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations
topic Deep learning
Regulatory genomics
Data augmentations
Model interpretability
url https://doi.org/10.1186/s13059-023-02941-w