Summary: | <p>Next-generation sequencing technologies have transformed our understanding of genetic variation segregating in populations and its relationship with phenotypic traits. Sequencing large populations at low coverage, thus sampling only a fraction of the genome of each individual, may increase statistical power in genetic mapping [Pasaniuc, 2012] compared to genotyping arrays. This thesis explores several novel applications of low-coverage population-based sequencing, using data from 488 recombinant inbred lines from the MAGIC population of <em>Arabidopsis thaliana</em>, descended from 19 inbred founder accessions. Based on the full catalogue of genetic variation that is available in the 19 founders [Gan, 2011], I describe every MAGIC genome as a mosaic of founder haplotypes and analyse the accuracy of the mosaics by simulation. I then use the mosaics in three ways. First, I investigate structural variation using a novel method that treats anomalies in the alignment of sequencing reads, potentially representing signatures of structural variants (SVs), as quantitative traits. These can be mapped genetically to identify loci in which genetic variation correlates with signatures of SVs. The method can distinguish short- (e.g. indels) and long-range (e.g. translocations) SVs and has led to the discovery of a large number of SVs segregating in the MAGIC population, including thousands of long-range SVs. I show that SVs have a significant impact on silencing gene expression and that they explain a large fraction of the phenotypic variation in several physiological traits. Second, I use the mosaic structure of the MAGIC lines to map recombination events and analyse lineage-specific recombination in MAGIC. I infer recombination hotspots and compare recombination in the MAGIC lines to the <em>Arabidopsis</em> genetic map.
Finally, I detect bacterial endosymbionts hosted in MAGIC genomes from unmapped reads that have high sequence similarity with bacterial DNA and examine whether variation in the presence of endosymbionts can be explained by host genetic variation.</p>
|