Genomic variation in human health and disease

Bibliographic Details
Main Author: McCarthy, D
Other Authors: Donnelly, P
Format: Thesis
Language: English
Published: 2015
Description
Summary: <p>Understanding the structure and function of genomic variation within and between individuals will be crucial for the translation of genomics into improved health and clinical outcomes. This thesis addresses current issues around the study of genomic variation in that context.</p>
<p>Variant annotation is a vital step in the analysis of whole-genome and whole-exome sequence data. I compared variant annotations for 80 million variants from a clinically-focused whole-genome sequencing study, obtaining annotations with two different sets of transcripts and two different software tools. I found that choice of transcripts and choice of software both have a large effect on variant annotation. The extent of discrepancy in annotations has implications for all research that relies on variant annotation, especially as we try to use whole-genome sequencing in the clinic.</p>
<p>Type 2 diabetes (T2D) is a common, complex genetic disease imposing a large global health burden. Although over 80 genomic loci have been associated with increased risk for T2D, many questions remain about the genomic architecture of the disease. I used 11 million rare, low-frequency and common single-nucleotide variants obtained from whole-genome sequence data from 2,657 individuals with and without T2D to assess the contributions of different classes of genomic variation to T2D susceptibility. Using linear mixed model methods and variance partitioning approaches I characterised contributions from variants in different allele frequency classes. Partitioning variance into different functional classes revealed significant four-fold enrichment (<em>P</em> &lt; 0.01) for variants in enhancer regions identified in pancreatic islet cells and significant depletion for variants without any functional annotation (<em>P</em> &lt; 0.01).</p>
<p>Single-cell RNA-sequencing (scRNA-seq) technologies are rapidly gaining traction to interrogate transcriptomic heterogeneity across individual cells. However, raw scRNA-seq data require a large amount of processing to obtain a clean, tidy dataset ready for statistical modeling. I have developed a new, self-contained R software package, SCATER, to fill the niche between raw scRNA-seq data and downstream analysis. The package streamlines the pre-processing, quality control and data normalisation procedures while enabling flexible ways to visualise data and integration with other scRNA-seq analysis tools.</p>