A study of generative large language model for medical research and healthcare
Abstract: There is enormous enthusiasm, and also concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which were not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text, including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves performance on biomedical NLP tasks. We apply GatorTronGPT to generate 20 billion words of synthetic text; NLP models trained on this synthetic text outperform models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT vs. 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT vs. 6.97 for human), and that physicians cannot reliably differentiate the two (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
Main Authors: Cheng Peng, Xi Yang, Aokun Chen, Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A. Mitchell, Naykky S. Ospina, Mustafa M. Ahmed, William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, Yonghui Wu
Format: Article
Language: English
Published: Nature Portfolio, 2023-11-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-023-00958-w
author | Cheng Peng, Xi Yang, Aokun Chen, Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A. Mitchell, Naykky S. Ospina, Mustafa M. Ahmed, William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, Yonghui Wu |
collection | DOAJ |
description | Abstract: There is enormous enthusiasm, and also concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which were not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text, including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves performance on biomedical NLP tasks. We apply GatorTronGPT to generate 20 billion words of synthetic text; NLP models trained on this synthetic text outperform models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT vs. 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT vs. 6.97 for human), and that physicians cannot reliably differentiate the two (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare. |
format | Article |
id | doaj.art-ef2ddec4ebcf44af961dc0a1bd072263 |
institution | Directory Open Access Journal |
issn | 2398-6352 |
language | English |
publishDate | 2023-11-01 |
publisher | Nature Portfolio |
record_format | Article |
series | npj Digital Medicine |
spelling | npj Digital Medicine, Vol. 6, Iss. 1, Pp. 1–10 (2023-11-01). Nature Portfolio. ISSN 2398-6352. DOI: 10.1038/s41746-023-00958-w. A study of generative large language model for medical research and healthcare. Authors and affiliations: Cheng Peng, Xi Yang, Aokun Chen (Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida); Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores (NVIDIA); Ying Zhang (Research Computing, University of Florida); Tanja Magoc, Gloria Lipori (Integrated Data Repository Research Services, University of Florida); Duane A. Mitchell (Lillian S. Wells Department of Neurosurgery, Clinical and Translational Science Institute, University of Florida); Naykky S. Ospina (Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida); Mustafa M. Ahmed (Division of Cardiovascular Medicine, Department of Medicine, College of Medicine, University of Florida); William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, Yonghui Wu (Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida). https://doi.org/10.1038/s41746-023-00958-w |
title | A study of generative large language model for medical research and healthcare |
url | https://doi.org/10.1038/s41746-023-00958-w |