Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes
The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary...
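Main Authors: | Lin, Michael F.; Kheradpour, Pouya; Mag Washietl, Stefan; Parker, Brian J.; Pedersen, Jakob S.; Kellis, Manolis |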
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | Cold Spring Harbor Laboratory Press, 2012 |
Online Access: | http://hdl.handle.net/1721.1/73052 |
_version_ | 1826204866749399040 |
---|---|
author | Lin, Michael F.; Kheradpour, Pouya; Mag Washietl, Stefan; Parker, Brian J.; Pedersen, Jakob S.; Kellis, Manolis |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_sort | Lin, Michael F. |
collection | MIT |
description | The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes—especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ∼2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape. |
first_indexed | 2024-09-23T13:02:37Z |
format | Article |
id | mit-1721.1/73052 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:02:37Z |
publishDate | 2012 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | dspace |
spelling | mit-1721.1/73052 2022-10-01T12:43:11Z Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes Lin, Michael F. Kheradpour, Pouya Mag Washietl, Stefan Parker, Brian J. Pedersen, Jakob S. Kellis, Manolis Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Kellis, Manolis Lin, Michael F. Mag Washietl, Stefan Kellis, Manolis The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes—especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ∼2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape. National Science Foundation (U.S.) (DBI 0644282) National Institutes of Health (U.S.) (U54 HG004555-01) 2012-09-19T17:56:09Z 2012-09-19T17:56:09Z 2011-10 2010-04 Article http://purl.org/eprint/type/JournalArticle 1088-9051 http://hdl.handle.net/1721.1/73052 Lin, M. F. et al. “Locating Protein-coding Sequences Under Selection for Additional, Overlapping Functions in 29 Mammalian Genomes.” Genome Research 21.11 (2011): 1916–1928. © 2011 by Cold Spring Harbor Laboratory Press en_US http://dx.doi.org/10.1101/gr.108753.110 Genome Research Creative Commons Attribution-NonCommercial 3.0 Unported License http://creativecommons.org/licenses/by-nc/3.0/ application/pdf Cold Spring Harbor Laboratory Press Genome Research |
title | Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes |
url | http://hdl.handle.net/1721.1/73052 |