Practical Considerations For the Deployment of Clinical NLP Systems

Bibliographic Details
Main Author: Lehman, Eric
Other Authors: Szolovits, Peter
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/156307
_version_ 1826189636844650496
author Lehman, Eric
author2 Szolovits, Peter
author_facet Szolovits, Peter
Lehman, Eric
author_sort Lehman, Eric
collection MIT
description Although recent advances in scaling large language models (LLMs) have resulted in improvements on many NLP tasks, it remains unclear whether these models, trained primarily on general web text, are the right tool in highly specialized, safety-critical domains such as healthcare. A healthcare system attempting to automate a clinical task must weigh all approaches with respect to safety, efficacy, and efficiency. This thesis investigates the challenges and implications of implementing LLMs in clinical settings, focusing on these three considerations. We first explore the potential biases affecting downstream patient safety that might be introduced by using LLMs in a zero- or few-shot setting and find that LLMs can propagate, or even amplify, harmful societal biases in a number of clinical tasks. Then, we examine the privacy considerations of pretraining a language model on clinical text containing protected health information (PHI) and find that simple probing methods are unable to meaningfully extract sensitive information from an encoder-only language model pretrained on non-deidentified electronic health record (EHR) notes. Finally, we conduct an extensive empirical analysis of 12 language models, ranging from 220M to 175B parameters, measuring their performance on 3 different clinical tasks that test their ability to parse and reason over electronic health records. We show that relatively small specialized clinical models are substantially more effective than larger models trained on general text and used through in-context learning. Further, we find that pretraining on clinical tokens allows for smaller, more parameter-efficient models that either match or outperform much larger language models trained on general text. We argue that using a clinical text-specific pretrained language model allows for an efficient, effective, and privacy-conscious approach, enabling a tailored and ethically responsible application of AI in healthcare.
first_indexed 2024-09-23T08:18:45Z
format Thesis
id mit-1721.1/156307
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T08:18:45Z
publishDate 2024
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/156307 2024-08-22T04:02:52Z Practical Considerations For the Deployment of Clinical NLP Systems Lehman, Eric Szolovits, Peter Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2024-08-21T18:55:34Z 2024-08-21T18:55:34Z 2024-05 2024-07-10T13:01:41.746Z Thesis https://hdl.handle.net/1721.1/156307 0000-0001-9919-2257 Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Copyright retained by author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Lehman, Eric
Practical Considerations For the Deployment of Clinical NLP Systems
title Practical Considerations For the Deployment of Clinical NLP Systems
title_sort practical considerations for the deployment of clinical nlp systems
url https://hdl.handle.net/1721.1/156307
work_keys_str_mv AT lehmaneric practicalconsiderationsforthedeploymentofclinicalnlpsystems