Protein codes promote selective subcellular compartmentalization


Bibliographic Details
Main Authors: Kilgore, Henry R.; Chinn, Itamar; Mikhael, Peter G.; Mitnikov, Ilan; Van Dongen, Catherine; Zylberberg, Guy; Afeyan, Lena; Banani, Salman F.; Wilson-Hawken, Susana; Ihn Lee, Tong; Barzilay, Regina; Young, Richard A.
Other Authors: Whitehead Institute for Biomedical Research
Format: Article
Language: en_US
Published: American Association for the Advancement of Science 2025
Online Access: https://hdl.handle.net/1721.1/158180
Description
Summary: Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must assemble. Here, we demonstrate that proteins with shared functions share amino acid sequence codes that guide them to compartment destinations. A protein language model, ProtGPS, was developed that predicts with high performance the compartment localization of human proteins excluded from the training set. ProtGPS successfully guided generation of novel protein sequences that selectively assemble in the nucleolus. ProtGPS identified pathological mutations that change this code and lead to altered subcellular localization of proteins. Our results indicate that protein sequences contain not only a folding code, but also a previously unrecognized code governing their distribution to diverse subcellular compartments.