RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria


Bibliographic Details
Main Authors: Guo, Junjie U., Bartel, David
Other Authors: Massachusetts Institute of Technology. Department of Biology
Format: Article
Language: en_US
Published: American Association for the Advancement of Science (AAAS) 2017
Online Access: http://hdl.handle.net/1721.1/107226
https://orcid.org/0000-0002-3872-2856
Description
Summary: INTRODUCTION: Many cellular RNAs contain regions that fold into stable structures required for function. Both Watson–Crick and noncanonical interactions can play important roles in forming these structures. An intriguing noncanonical structure is the RNA G-quadruplex (RG4), a four-stranded structure containing two or more layers of G-quartets, in which the Watson–Crick face of each of four G residues pairs to the Hoogsteen face of the neighboring G residues. RG4 regions can be very stable in vitro, particularly in the presence of K+, and thus they are generally assumed to be predominantly folded within cells, which have ample K+. Indeed, these structures have been implicated in mRNA processing and translation, with recently proposed roles in cancer and other human diseases. However, the number of cellular RNAs that can fold into RG4 structures has been unclear, as has been the extent to which these RG4 regions are folded in cells.
RATIONALE: Enzymes and chemicals that act on RNA with structure-dependent preferences provide valuable tools for detecting and monitoring RNA folding. For example, dimethyl sulfate (DMS) treatment of RNA, either in vitro or in cells, coupled with high-throughput sequencing of abortive primer-extension products can monitor the folding states of many RNAs in one experiment. Analogous high-throughput methods use cell-permeable variants of SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) reagents. These methods reveal important differences between RNA structures formed in vivo and those formed in vitro. However, they are designed to detect Watson–Crick pairing and thus do not identify RG4 structures or provide information on their folding states. After recognizing that RG4 regions can block reverse transcriptase, we reasoned that this property, together with the known ability of RG4s to protect the N7 of participating G nucleotides from DMS modification, could be used to develop a suite of high-throughput methods to both identify endogenous RNAs that can fold into RG4s in vitro and determine whether these regions also fold in cells.
RESULTS: We first developed a high-throughput method that identifies RG4 regions on the basis of their propensity to stall reverse transcriptase in a K+-dependent manner. Applying this method to RNA from mammalian cell lines and yeast, we identified >10,000 endogenous regions that form RG4s in vitro, thereby expanding by a factor of >100 the catalog of endogenous regions with experimentally supported propensity to fold into RG4 structures. To infer the folding state of these RG4 regions in vitro and in cells, DMS treatment was performed before profiling of reverse-transcriptase stops. These analyses showed that, in contrast to previous assumptions, regions that folded into RG4 structures in vitro were overwhelmingly unfolded in vivo, as indicated by their accessibility to DMS modification in cells. A complementary probing strategy using a SHAPE reagent confirmed the unfolded state of most RG4 regions in eukaryotic cells. Moreover, RG4 regions remained unfolded both in cells depleted of adenosine 5′-triphosphate and in cells lacking a helicase known to unfold RG4 regions in vitro. Applying our probing methods to bacteria revealed a different behavior, in that model RG4 regions that were unfolded in eukaryotic cells were folded when expressed in Escherichia coli. However, these ectopically expressed quadruplexes impaired mRNA translation and cell growth, which helps explain why very few endogenous sequences that could fold into RG4s were detected in the transcriptomes of E. coli and the two other eubacteria analyzed.
CONCLUSION: In mammals, thousands of endogenous RNA sequences have regions that can fold into RG4s in vitro, but these regions are globally unfolded in eukaryotic cells, presumably by robust and effective machinery that remains to be fully characterized. In contrast, RG4 regions are permitted to fold in E. coli cells, but E. coli and other bacteria have undergone evolutionary depletion of endogenous RG4-forming sequences.
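The K+-dependent reverse-transcriptase stalling criterion described in RESULTS can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The input layout (per-position stop and coverage counts), the K+-free comparison condition, the function names, and the thresholds `min_coverage` and `min_delta` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: position -> (RT stops at position, reads covering it)
Counts = Dict[int, Tuple[int, int]]


def stop_fraction(stops: int, coverage: int) -> float:
    """Fraction of reads covering a position that terminate there."""
    return stops / coverage if coverage > 0 else 0.0


def k_dependent_stalls(with_k: Counts, without_k: Counts,
                       min_coverage: int = 20,
                       min_delta: float = 0.3) -> List[int]:
    """Return positions whose stop fraction rises by at least min_delta
    when K+ is present. Both thresholds are illustrative assumptions."""
    hits = []
    for pos in sorted(with_k.keys() & without_k.keys()):
        stops_k, cov_k = with_k[pos]
        stops_c, cov_c = without_k[pos]
        # Skip positions with too few reads in either condition.
        if min(cov_k, cov_c) < min_coverage:
            continue
        delta = stop_fraction(stops_k, cov_k) - stop_fraction(stops_c, cov_c)
        if delta >= min_delta:
            hits.append(pos)
    return hits


# Toy counts, invented purely to exercise the function.
with_k = {100: (80, 100), 101: (5, 100), 102: (9, 10)}
without_k = {100: (10, 100), 101: (4, 100), 102: (1, 10)}
print(k_dependent_stalls(with_k, without_k))  # [100]
```

In this toy input, position 100 stalls far more often with K+ and is reported, position 101 shows no K+ dependence, and position 102 is skipped for low coverage. The same comparison could in principle be applied to DMS-treated versus untreated samples, since the abstract infers folding from the protection of G residues against DMS modification.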