FRESCo: finding regions of excess synonymous constraint in diverse viruses

Background: The increasing availability of sequence data for many viruses provides power to detect regions under unusual evolutionary constraint at a high resolution. One approach leverages the synonymous substitution rate as a signature to pinpoint genic regions encoding overlapping or embedded fun...

Bibliographic Details
Main Authors: Lin, Michael F., Jungreis, Irwin, Kellis, Manolis, Sabeti, Pardis C, Sealfon, Rachel S., Wolf, Maxim Y.
Other Authors: Massachusetts Institute of Technology. Computational and Systems Biology Program
Format: Article
Language: English
Published: BioMed Central 2015
Online Access: http://hdl.handle.net/1721.1/97563
https://orcid.org/0000-0002-8205-9457
author Lin, Michael F.
Jungreis, Irwin
Kellis, Manolis
Sabeti, Pardis C
Sealfon, Rachel S.
Wolf, Maxim Y.
author2 Massachusetts Institute of Technology. Computational and Systems Biology Program
collection MIT
description Background: The increasing availability of sequence data for many viruses provides power to detect regions under unusual evolutionary constraint at a high resolution. One approach leverages the synonymous substitution rate as a signature to pinpoint genic regions encoding overlapping or embedded functional elements. Protein-coding regions in viral genomes often contain overlapping RNA structural elements, reading frames, regulatory elements, microRNAs, and packaging signals. Synonymous substitutions in these regions would be selectively disfavored and thus these regions are characterized by excess synonymous constraint. Codon choice can also modulate transcriptional efficiency, translational accuracy, and protein folding. Results: We developed a phylogenetic codon model-based framework, FRESCo, designed to find regions of excess synonymous constraint in short, deep alignments, such as individual viral genes across many sequenced isolates. We demonstrated the high specificity of our approach on simulated data and applied our framework to the protein-coding regions of approximately 30 distinct species of viruses with diverse genome architectures. Conclusions: FRESCo recovers known multifunctional regions in well-characterized viruses such as hepatitis B virus, poliovirus, and West Nile virus, often at a single-codon resolution, and predicts many novel functional elements overlapping viral genes, including in Lassa and Ebola viruses. In a number of viruses, the synonymously constrained regions that we identified also display conserved, stable predicted RNA structures, including putative novel elements in multiple viral species.
first_indexed 2024-09-23T12:08:19Z
format Article
id mit-1721.1/97563
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T12:08:19Z
publishDate 2015
publisher BioMed Central
record_format dspace
spelling mit-1721.1/97563
2022-10-01T08:24:49Z
FRESCo: finding regions of excess synonymous constraint in diverse viruses
Lin, Michael F.
Jungreis, Irwin
Kellis, Manolis
Sabeti, Pardis C
Sealfon, Rachel S.
Wolf, Maxim Y.
Massachusetts Institute of Technology. Computational and Systems Biology Program
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Sealfon, Rachel S.
Jungreis, Irwin
Wolf, Maxim Y.
Kellis, Manolis
National Science Foundation (U.S.). Graduate Research Fellowship
National Institute of Allergy and Infectious Diseases (U.S.) (HHSN272200900049C)
2015-06-29T17:30:11Z
2015-06-29T17:30:11Z
2015-02
2014-12
2015-06-29T08:38:48Z
Article
http://purl.org/eprint/type/JournalArticle
1465-6906
1474-7596
http://hdl.handle.net/1721.1/97563
Sealfon, Rachel S, Michael F Lin, Irwin Jungreis, Maxim Y Wolf, Manolis Kellis, and Pardis C Sabeti. “FRESCo: Finding Regions of Excess Synonymous Constraint in Diverse Viruses.” Genome Biology 16, no. 1 (February 17, 2015).
https://orcid.org/0000-0002-8205-9457
en
http://dx.doi.org/10.1186/s13059-015-0603-7
Genome Biology
Sealfon et al.; licensee BioMed Central.
application/pdf
BioMed Central
title FRESCo: finding regions of excess synonymous constraint in diverse viruses
url http://hdl.handle.net/1721.1/97563
https://orcid.org/0000-0002-8205-9457