Development and application of computational methods to study DNA modifications

Bibliographic Details
Main Author: Velikova, GV
Other Authors: Schuster-Böckler, B
Format: Thesis
Language: English
Published: 2020
Subjects: Bioinformatics; Data Science; Method Development; Biomedical Sciences
_version_ 1797084673642332160
author Velikova, GV
author2 Schuster-Böckler, B
author_facet Schuster-Böckler, B
Velikova, GV
author_sort Velikova, GV
collection OXFORD
description <p>Epigenetic modifications of DNA shape cell fate in development, differentiation, and disease. The existing gold-standard sequencing technologies for epigenetic DNA modifications are based on sodium bisulfite, a harsh chemical treatment that degrades DNA. A novel bisulfite-free, base-resolution sequencing method, TET Assisted Pyridine-borane Sequencing (TAPS), was developed to detect the most abundant DNA modifications. In contrast to bisulfite sequencing (BS), TAPS relies on mild reactions to detect modified bases. From a bioinformatics perspective, sodium bisulfite treatment substantially reduces information content, complicating data processing and the detection of genetic variation. In fact, most existing modification-calling tools for bisulfite-treated data do not distinguish between modifications and genetic variants, which results in false positives. A computational tool, asTair, was created to process DNA modification sequencing data. It was designed primarily for handling TAPS sequencing output, but also contains functions that are useful for the analysis of bisulfite sequencing data. TAPS was shown to have more even coverage than BS, with a comparable conversion rate over CpGs, and to be applicable to low-input samples. A Deep Neural Network (DNN) model that detects single-nucleotide variants in TAPS- and BS-converted sequencing data was created to enable sensitive modification and variant calling. The algorithm showed precision and recall above 0.9 for classifying variants, modifications and reference positions. The model outperformed available variant callers for whole-genome sequencing and BS data. Applying such a model to real datasets could improve the accuracy of identifying real DNA modifications masked by genetic variation and errors, as around a sixth of all SNPs could be misclassified as modifications.</p>
first_indexed 2024-03-07T01:58:19Z
format Thesis
id oxford-uuid:9c7e449a-f0c1-439d-82c6-1b64e52d2dd6
institution University of Oxford
language English
last_indexed 2024-03-07T01:58:19Z
publishDate 2020
record_format dspace
spelling oxford-uuid:9c7e449a-f0c1-439d-82c6-1b64e52d2dd6; 2022-03-27T00:36:23Z; Development and application of computational methods to study DNA modifications; Thesis; http://purl.org/coar/resource_type/c_db06; uuid:9c7e449a-f0c1-439d-82c6-1b64e52d2dd6; Bioinformatics; Data Science; Method Development; Biomedical Sciences; English; Hyrax Deposit; 2020; Velikova, GV; Schuster-Böckler, B; Boccellato, F
spellingShingle Bioinformatics
Data Science
Method Development
Biomedical Sciences
Velikova, GV
Development and application of computational methods to study DNA modifications
title Development and application of computational methods to study DNA modifications
title_full Development and application of computational methods to study DNA modifications
title_fullStr Development and application of computational methods to study DNA modifications
title_full_unstemmed Development and application of computational methods to study DNA modifications
title_short Development and application of computational methods to study DNA modifications
title_sort development and application of computational methods to study dna modifications
topic Bioinformatics
Data Science
Method Development
Biomedical Sciences
work_keys_str_mv AT velikovagv developmentandapplicationofcomputationalmethodstostudydnamodifications