Summary: | Proteins serve critical structural and dynamic functional roles in the cells of all living organisms. Understanding how proteins contribute to biological function rests on having appropriate technologies for their identification and quantification. The central dogma of molecular biology, the flow of information from DNA to RNA to protein, has been studied for decades because these molecules are critical to cell function and diversity. The advent of polymerase chain reaction (PCR) amplification of nucleic acids was pivotal in advancing high-throughput molecular interrogation and analysis of DNA and RNA at the whole-genome and whole-transcriptome levels. In contrast, the study of proteins has lagged technologically because there is no equivalent of PCR to amplify and detect low-copy-number proteins. Instead, protein sequencing and identification methods have relied on ensemble measurements from many cells, which mask cell-to-cell variations¹. While some researchers have turned to transcriptomics as a proxy for the protein composition of cells, gene expression at the transcriptomic level correlates only weakly with the proteomic profile, owing to variability in the translational efficiency of different mRNAs and to differences between mRNA and protein lifetimes². In addition, posttranslational modifications introduce significant variability in protein abundance and primary sequence relative to the transcriptome. Vital biological processes such as synaptic plasticity, metabolic signaling pathways, and stem cell differentiation all depend on protein expression. Many diseases also originate from genetic mutations that are in turn translated into a single aberrant protein or a set of aberrant proteins. Diseases such as cancer and neurodegeneration tend to involve mutations of unclear origin and polygenic interactions.
Such diseases are best understood and addressed at the proteomic level, since their pathology is directly related to disrupted proteostasis at the cellular level³,⁴. The lack of technology for high-resolution protein-level analysis represents a significant gap in advancing important biological research. To address this gap, we propose a technology for single-molecule identification and sequencing of proteins, enabling high-resolution interrogation of the proteome and ultrasensitive diagnostics critical for early disease detection. The proposed technology will detect single molecules by labeling and imaging amino acids, the building blocks of proteins, which a novel chemical design called ClickP sequentially isolates from the protein N-terminus and immobilizes, thereby removing recognition interference from neighboring amino acids.
|