Performance of ChatGPT on Stage 1 of the Taiwanese medical licensing exam
Introduction: Since its release by OpenAI in November 2022, ChatGPT has been tested in numerous studies evaluating its performance on medical examinations. This study evaluates ChatGPT's accuracy and logical reasoning across all 10 subjects in Stage 1 of the Senior Professional and Technical Examinations for Medical Doctors (SPTEMD) in Taiwan, with questions in both Chinese and English.

Methods: We tested ChatGPT-4 on SPTEMD Stage 1 multiple-choice questions drawn from three separate examinations held in February 2022, July 2022, and February 2023. The questions cover 10 subjects: biochemistry and molecular biology, anatomy, embryology and developmental biology, histology, physiology, microbiology and immunology, parasitology, pharmacology, pathology, and public health. We then analyzed the model's accuracy on each subject.

Results: ChatGPT exceeded the 60% passing threshold on all three examinations, with an overall average score of 87.8%. Its best performance was in biochemistry, with an average score of 93.8%. Its performance in anatomy, parasitology, and embryology was weaker, and its scores in embryology and parasitology varied widely.

Conclusion: ChatGPT may facilitate not only exam preparation but also the accessibility of medical education and continuing education for medical professionals. This study demonstrates ChatGPT's potential competence across the subjects of SPTEMD Stage 1 and suggests that it could be a helpful learning and exam-preparation tool for medical students and professionals.

Main Authors: Chao-Hsiung Huang, Han-Jung Hsiao, Pei-Chun Yeh, Kuo-Chen Wu, Chia-Hung Kao
Format: Article
Language: English
Published: SAGE Publishing, 2024-02-01
Series: Digital Health
Collection: DOAJ
Online Access: https://doi.org/10.1177/20552076241233144
ISSN: 2055-2076
Author affiliations:
- Chao-Hsiung Huang: School of Medicine, Taichung
- Han-Jung Hsiao: Artificial Intelligence Center, China Medical University, Taichung
- Pei-Chun Yeh: Artificial Intelligence Center, China Medical University, Taichung
- Kuo-Chen Wu: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei
- Chia-Hung Kao: Department of Bioinformatics and Medical Engineering, Asia University, Taichung