Application of the Neutral Indel Model to genome sequences for diverse metazoans

The Neutral Indel Model is able to predict accurately the distribution of indel events in alignments of neutrally evolving genomic sequence. Here, I apply this model to a diverse range of metazoan species pairs, to a number of ends. First, I apply the Neutral Indel Model to alignments of genome sequ...

Bibliographic Details
Main Author: Meader, S
Other Authors: Ponting, C
Format: Thesis
Language: English
Published: 2010
Subjects:
_version_ 1826315776112459776
author Meader, S
author2 Ponting, C
author_facet Ponting, C
Meader, S
author_sort Meader, S
collection OXFORD
description The Neutral Indel Model is able to predict accurately the distribution of indel events in alignments of neutrally evolving genomic sequence. Here, I apply this model to a diverse range of metazoan species pairs, to a number of ends. First, I apply the Neutral Indel Model to alignments of genome sequences for species within the mammalian clade in order to estimate the quantities of functional DNA shared between species pairs. I demonstrate that as the evolutionary divergence between species pairs increases, estimates of functional sequence drop off dramatically. This pattern is not replicated in extensive simulations of genome sequence alignments, suggesting that functional (and mostly non-coding) sequence is turning over at a rapid rate. I also estimate that between 200 and 300 Mb (6.5-10%) of the human genome is under evolutionary constraint, a considerably higher quantity of sequence than has been estimated by previous whole genome analyses. Second, extending my analyses to consider more diverse metazoan species, I provide estimates for functional bases within organisms’ genomes that appear to mirror our conceptions of organismal complexity. Third, I develop the Neutral Indel Model as a method for assessing genome sequence quality, by quantifying indel errors within alignments of closely related (dS < 0.1) species pairs. Applying this method to six primate genome sequence assemblies, I demonstrate that the frequency of indel error events per base varies up to six-fold. Further to this, I show that second generation sequencing technologies can be used to create high quality genome sequence assemblies and to ameliorate errors in pre-existing assemblies. Finally, I analyse patterns of indel mutations in primate transposable elements and show that indels are not randomly distributed within these sequences due to regularly spaced homo-nucleotide motifs.
first_indexed 2024-03-06T19:17:41Z
format Thesis
id oxford-uuid:18f8c5fc-28f2-4d5e-aa87-c1086582213c
institution University of Oxford
language English
last_indexed 2024-12-09T03:32:14Z
publishDate 2010
record_format dspace
spelling oxford-uuid:18f8c5fc-28f2-4d5e-aa87-c1086582213c
2024-12-01T15:31:53Z
Application of the Neutral Indel Model to genome sequences for diverse metazoans
Thesis
http://purl.org/coar/resource_type/c_db06
uuid:18f8c5fc-28f2-4d5e-aa87-c1086582213c
Evolution (zoology)
Bioinformatics (life sciences)
Mathematical genetics and bioinformatics (statistics)
Genetics (life sciences)
English
Oxford University Research Archive - Valet
2010
Meader, S
Ponting, C
spellingShingle Evolution (zoology)
Bioinformatics (life sciences)
Mathematical genetics and bioinformatics (statistics)
Genetics (life sciences)
Meader, S
Application of the Neutral Indel Model to genome sequences for diverse metazoans
title Application of the Neutral Indel Model to genome sequences for diverse metazoans
title_full Application of the Neutral Indel Model to genome sequences for diverse metazoans
title_fullStr Application of the Neutral Indel Model to genome sequences for diverse metazoans
title_full_unstemmed Application of the Neutral Indel Model to genome sequences for diverse metazoans
title_short Application of the Neutral Indel Model to genome sequences for diverse metazoans
title_sort application of the neutral indel model to genome sequences for diverse metazoans
topic Evolution (zoology)
Bioinformatics (life sciences)
Mathematical genetics and bioinformatics (statistics)
Genetics (life sciences)
work_keys_str_mv AT meaders applicationoftheneutralindelmodeltogenomesequencesfordiversemetazoans