Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine

Abstract
One of the major barriers to using large language models (LLMs) in medicine is the perception that they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript we develop diagnostic reasoning prompts to study whether LLMs can imitate clinical reasoning while accurately forming a diagnosis. We find that GPT-4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can imitate clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether an LLM's response is likely correct and can be trusted for patient care. Prompting methods that use diagnostic reasoning have the potential to mitigate the "black box" limitations of LLMs, bringing them one step closer to safe and effective use in medicine.
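
The record does not reproduce the paper's actual prompts, but the idea of a diagnostic reasoning prompt can be sketched. The example below is a minimal, hypothetical illustration of how one might ask a GPT-4 model to show clinician-style reasoning before committing to a diagnosis, using the OpenAI Python client; the model name, prompt wording, and case vignette are assumptions for illustration, not the prompts evaluated in the study.

```python
# Minimal sketch of a diagnostic-reasoning prompt, assuming the OpenAI Python
# client (openai>=1.0) and access to a GPT-4 model. The prompt text and case
# vignette below are illustrative only, not the prompts used in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

case_vignette = (
    "A 58-year-old man presents with two hours of crushing substernal chest "
    "pain radiating to the left arm, with diaphoresis and nausea."
)

# Ask the model to walk through clinician-style reasoning before its answer.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a physician. Reason through the case like a clinician: "
                "list the key findings, state a differential diagnosis, weigh the "
                "evidence for and against each possibility, then give the single "
                "most likely diagnosis."
            ),
        },
        {"role": "user", "content": case_vignette},
    ],
)

print(response.choices[0].message.content)
```

In the paper's framing, the value of such a prompt is that the intermediate reasoning gives a physician an interpretable rationale to audit before deciding whether to trust the final diagnosis.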


Bibliographic Details
Main Authors: Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H. Chen
Format: Article
Language: English
Published: Nature Portfolio, 2024-01-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-024-01010-1
ISSN: 2398-6352
Author Affiliations: Department of Medicine, Stanford University (Savage, Nayak, Rangan, Chen); Palo Alto Veterans Affairs Medical Center (Gallo)
Collection: DOAJ (Directory of Open Access Journals)