A probabilistic model for the evolution of RNA structure

Background. For the purposes of finding and aligning noncoding RNA gene- and cis-regulatory elements in multiple-genome datasets, it is useful to be able to derive multi-sequence stochastic grammars (and hence multiple alignment algorithms) systematically, starting from hypotheses about the various...


Bibliographic details
Format:	Journal article
Published:	Biomed Central 2004
Subjects:	Life Sciences; Biochemistry

collection	OXFORD
description	Background. For the purposes of finding and aligning noncoding RNA gene- and cis-regulatory elements in multiple-genome datasets, it is useful to be able to derive multi-sequence stochastic grammars (and hence multiple alignment algorithms) systematically, starting from hypotheses about the various kinds of random mutation event and their rates. Results. Here, we consider a highly simplified evolutionary model for RNA, called “The TKF91 Structure Tree” (following Thorne, Kishino and Felsenstein’s 1991 model of sequence evolution with indels), which we have implemented for pairwise alignment as proof of principle for such an approach. The model, its strengths and its weaknesses are discussed with reference to four examples of functional ncRNA sequences: a riboswitch (guanine), a zipcode (nanos), a splicing factor (U4) and a ribozyme (RNase P). As shown by our visualisations of posterior probability matrices, the selected examples illustrate three different signatures of natural selection that are highly characteristic of ncRNA: (i) co-ordinated basepair substitutions, (ii) co-ordinated basepair indels and (iii) whole-stem indels. Conclusions. Although all three types of mutation “event” are built into our model, events of type (i) and (ii) are found to be better modeled than events of type (iii). Nevertheless, we hypothesise from the model’s performance on pairwise alignments that it would form an adequate basis for a prototype multiple alignment and genefinding tool. Background. One of the promises of comparative genomics is to annotate previously undetectable functional signals in genomic sequence, by identifying and characterising evolutionarily conserved elements. A principled way to extract such signals is by fitting the data to probabilistic models of the molecular evolutionary process. The logic runs as follows: suppose there are various kinds of conserved element x, y, z, ... (e.g. exons, bits of RNA, promoters, etc.) that might explain an observed sequence homology. For each of these scenarios, we can construct a probabilistic model M_x, M_y, M_z, ... and compare the likelihood of the observed data under each of these models. The model with the best fit indicates the type of functional element present in the sequence.
first_indexed	2024-03-07T01:27:50Z
format	Journal article
id	oxford-uuid:9290362c-9921-4b12-8bcb-8853a7304184
institution	University of Oxford
last_indexed	2024-03-07T01:27:50Z
publishDate	2004
publisher	Biomed Central
record_format	dspace
title	A probabilistic model for the evolution of RNA structure
topic	Life Sciences Biochemistry
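
The model-comparison logic in the description (build one probabilistic model per kind of conserved element, then pick the model under which the observed data are most likely) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the TKF91 Structure Tree or any method from the article: each hypothetical model M_x, M_y, M_z is reduced to a single per-column match probability for an ungapped pairwise alignment, and the names `log_likelihood`, `best_model` and `MODELS` are invented.

```python
import math


def log_likelihood(seq_a, seq_b, p_match):
    """Log-likelihood of an ungapped pairwise alignment under a toy model.

    Assumption: every column is independently a match with probability
    p_match and a mismatch with probability 1 - p_match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    total = 0.0
    for a, b in zip(seq_a, seq_b):
        total += math.log(p_match if a == b else 1.0 - p_match)
    return total


def best_model(seq_a, seq_b, models):
    """Return the name of the highest-likelihood model and all scores."""
    scores = {
        name: log_likelihood(seq_a, seq_b, p_match)
        for name, p_match in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical models: strongly conserved, weakly conserved, unconserved.
MODELS = {"M_x": 0.9, "M_y": 0.6, "M_z": 0.25}

if __name__ == "__main__":
    name, scores = best_model("GGAUCC", "GGAUCC", MODELS)
    print(name, scores)
```

A realistic model would score gaps and basepair covariation as well, which this sketch deliberately omits; only the final step, comparing likelihoods across competing models, carries over.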
