A study of generative large language model for medical research and healthcare
Abstract: There is enormous enthusiasm, and also concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which were not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text, including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves performance on biomedical NLP tasks. We apply GatorTronGPT to generate 20 billion words of synthetic text; NLP models trained on this synthetic text outperform models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT vs. 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT vs. 6.97 for human), and that physicians cannot reliably differentiate the two (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
Main Authors: Cheng Peng, Xi Yang, Aokun Chen, Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A. Mitchell, Naykky S. Ospina, Mustafa M. Ahmed, William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, Yonghui Wu
Format: Article
Language: English
Published: Nature Portfolio, 2023-11-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-023-00958-w
author | Cheng Peng, Xi Yang, Aokun Chen, Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores, Ying Zhang, Tanja Magoc, Gloria Lipori, Duane A. Mitchell, Naykky S. Ospina, Mustafa M. Ahmed, William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, Yonghui Wu |
collection | DOAJ |
description | Abstract: There is enormous enthusiasm, and also concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which were not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text, including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves performance on biomedical NLP tasks. We apply GatorTronGPT to generate 20 billion words of synthetic text; NLP models trained on this synthetic text outperform models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT vs. 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT vs. 6.97 for human), and that physicians cannot reliably differentiate the two (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare. |
format | Article |
id | doaj.art-ef2ddec4ebcf44af961dc0a1bd072263 |
institution | Directory Open Access Journal |
issn | 2398-6352 |
language | English |
publishDate | 2023-11-01 |
publisher | Nature Portfolio |
record_format | Article |
series | npj Digital Medicine |
spelling | npj Digital Medicine, Vol. 6, Iss. 1, Pp. 1–10 (2023-11-01). Nature Portfolio. ISSN 2398-6352. DOI: 10.1038/s41746-023-00958-w. A study of generative large language model for medical research and healthcare. Authors and affiliations: Cheng Peng, Xi Yang, Aokun Chen (Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida); Kaleb E. Smith, Nima PourNejatian, Anthony B. Costa, Cheryl Martin, Mona G. Flores (NVIDIA); Ying Zhang (Research Computing, University of Florida); Tanja Magoc, Gloria Lipori (Integrated Data Repository Research Services, University of Florida); Duane A. Mitchell (Lillian S. Wells Department of Neurosurgery, Clinical and Translational Science Institute, University of Florida); Naykky S. Ospina (Division of Endocrinology, Department of Medicine, College of Medicine, University of Florida); Mustafa M. Ahmed (Division of Cardiovascular Medicine, Department of Medicine, College of Medicine, University of Florida); William R. Hogan, Elizabeth A. Shenkman, Yi Guo, Jiang Bian, Yonghui Wu (Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida). https://doi.org/10.1038/s41746-023-00958-w |
title | A study of generative large language model for medical research and healthcare |
url | https://doi.org/10.1038/s41746-023-00958-w |