A Unified View of Protein Low-complexity Regions (LCRs) Across Species

Low-complexity regions (LCRs) in proteins play a role in a variety of important cellular processes, dispersed across different fields in biology such as transcription, extracellular structure, and stress response. LCRs have been shown to vary in amino acid composition and structure, and can act as i...

Bibliographic Details
Main Author: Lee, Byron
Other Authors: Calo, Eliezer
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Links: https://hdl.handle.net/1721.1/150241
https://orcid.org/0000-0001-7132-2662
_version_ 1826211805910794240
author Lee, Byron
author2 Calo, Eliezer
author_facet Calo, Eliezer
Lee, Byron
author_sort Lee, Byron
collection MIT
description Low-complexity regions (LCRs) in proteins play a role in a variety of important cellular processes, dispersed across different fields in biology such as transcription, extracellular structure, and stress response. LCRs have been shown to vary in amino acid composition and structure, and can act as interacting domains capable of forming phase-separated higher-order assemblies. However, we lack a unified view of LCRs that incorporates all of the information in their sequences, features, relationships, and functions. In this thesis, I present a unified view of LCRs by 1) co-developing a framework based on the features and relationships of LCRs which are important in their roles as versatile interacting and phase-separating domains and 2) seeing whether this framework may provide a more general understanding of the functions of LCRs in proteins. Using the systematic dotplot matrix approach that we developed, we define LCR type/copy relationships for proteins across the proteome. Based on these definitions, we show the importance of K-rich LCR copy number for the RNA polymerase I subunit RPA43 for both assembly in vitro and localization in cells, demonstrating how principles of LCR copy number can relate these two processes. Moreover, by mapping regions of LCR sequence space to higher-order assemblies, such as the nucleolus, metazoan extracellular matrix and plant cell wall, we relate LCR functions across different fields and suggest that LCR functions may be unified in their roles in higher-order assemblies. Using this unified view, we uncover scaffold-client relationships among E-rich LCR-containing proteins in the nucleolus and discover TCOF1 as a self-assembling scaffold of the nucleolar fibrillar center. We go on to uncover previously undescribed regions of LCR sequence space with signatures of higher-order assemblies, including a teleost-specific T/H-rich sequence space. Thus, this work provides a framework that can unify the disparate functions of LCRs and enables discovery of how LCRs encode higher-order assemblies of organisms.
first_indexed 2024-09-23T15:11:21Z
format Thesis
id mit-1721.1/150241
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T15:11:21Z
publishDate 2023
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/150241 2023-04-01T03:11:15Z Massachusetts Institute of Technology. Department of Biology Ph.D. 2023-03-31T14:42:04Z 2023-03-31T14:42:04Z 2023-02 2023-03-03T06:01:29.017Z Thesis In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf
title A Unified View of Protein Low-complexity Regions (LCRs) Across Species
url https://hdl.handle.net/1721.1/150241
https://orcid.org/0000-0001-7132-2662