Genomic Language Models for Protein Function and Property Prediction

In the field of natural language processing (NLP), large language models (LLMs) trained on enormous corpora of unlabeled sequence data have demonstrated state-of-the-art performance on a variety of downstream tasks. This approach is appealing because one model can be easily adapted to do well in man...

Full description

Bibliographic Details
Main Author: Boshar, Sam T.
Other Authors: Kellis, Manolis
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access:https://hdl.handle.net/1721.1/156816