Searching for cancer biomarkers in the human epigenetic landscape

Bibliographic Details
Main Author: Jackson, F
Other Authors: Song, C
Format: Thesis
Language: English
Published: 2024
Subjects:
Description
Summary: <p>Methylation and hydroxymethylation are key epigenetic regulators of the genome. Without changing the underlying sequence, these epigenetic modifications allow cells to orchestrate highly specialised gene expression programs, up-regulating genes with important tissue-specific functions and silencing genes used elsewhere. Methylation patterns across the genome are highly tissue-specific, which makes them useful candidate biomarkers for a variety of diagnostic tests. Methylation changes are also emerging as a consistent early harbinger of malignancy in cancer. There is a pressing need for non-invasive cancer diagnostic tests, and therefore intense interest in the development of methylation-based cancer diagnostics.</p> <p>In this thesis, I present novel methods and data focused on gaining a better understanding of genome-wide methylation and hydroxymethylation patterns, and explore how these insights can be used for non-invasive cancer diagnosis. I collate a comprehensive set of genome-wide tissue-informative markers from public data, then use these markers to perform deconvolution of cell-free DNA (cfDNA) obtained from liquid biopsies of cancer patients. I identify significantly increased liver tumour contribution in liver cancer patients, and show that the inferred tissue contribution can be used to distinguish cancer patients from non-cancer controls. In clinical contexts, patient data is generally scarce and a limiting factor for diagnostic models. I devise new machine-learning-based cfDNA deconvolution and classification algorithms with more efficient data use and improved diagnostic accuracy.</p> <p>The availability of high-quality training data is a major bottleneck for these models, and in the second half of this thesis I focus on collating and analysing a new resource dataset: genome-wide methylation and hydroxymethylation signatures for 22 healthy tissues and 10 tumour types at the resolution of individual CpGs. 
I develop a novel spatial segmentation method to disaggregate the genome into functional building blocks of methylation and hydroxymethylation, then identify tissue- and tumour-informative regions within these blocks. Finally, I explore the downstream functional consequences of methylation and hydroxymethylation by building a gene expression prediction model that can accurately predict gene expression profiles in unseen healthy and tumour tissues, using these epigenetic marks alone.</p>
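
The summary above does not say how the cfDNA deconvolution is computed. As a rough illustration of the general idea only, the sketch below treats a cfDNA methylation profile as a non-negative mixture of per-tissue reference profiles and recovers the mixing fractions with a generic multiplicative-update least-squares solver. This is not the thesis's algorithm: the function name `deconvolve`, the three tissues, and all marker values are hypothetical and chosen purely for demonstration.

```python
def deconvolve(reference, mixture, n_iter=5000, eps=1e-12):
    """Estimate non-negative tissue fractions p so that reference * p ~ mixture.

    reference: one row per marker, one column per tissue, methylation
               levels in [0, 1] (hypothetical values in this sketch).
    mixture:   observed methylation level of the sample at each marker.
    Generic multiplicative-update non-negative least squares; an
    illustrative stand-in, not the method used in the thesis.
    """
    n_markers = len(reference)
    n_tissues = len(reference[0])
    # Precompute R^T y and R^T R once; both are non-negative here.
    rty = [sum(reference[m][t] * mixture[m] for m in range(n_markers))
           for t in range(n_tissues)]
    rtr = [[sum(reference[m][a] * reference[m][b] for m in range(n_markers))
            for b in range(n_tissues)]
           for a in range(n_tissues)]
    # Start from equal fractions; each update keeps them non-negative.
    p = [1.0 / n_tissues] * n_tissues
    for _ in range(n_iter):
        denom = [sum(rtr[a][b] * p[b] for b in range(n_tissues))
                 for a in range(n_tissues)]
        p = [p[a] * rty[a] / (denom[a] + eps) for a in range(n_tissues)]
    # Rescale so the fractions sum to one.
    total = sum(p)
    return [x / total for x in p]


# Hypothetical reference atlas: columns are blood, liver, lung.
reference = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.5],
]
# Simulate a sample with a known composition, then try to recover it.
true_fractions = [0.80, 0.15, 0.05]
mixture = [sum(r * f for r, f in zip(row, true_fractions)) for row in reference]

estimated = deconvolve(reference, mixture)
print([round(x, 3) for x in estimated])
```

In this toy setting the sample is an exact, noise-free mixture, so the estimate should closely match the simulated fractions. Real cfDNA data is noisy and tumour-derived fractions can be very small, which is one reason more data-efficient methods are of interest.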