Performance of ChatGPT on Stage 1 of the Taiwanese medical licensing exam
Introduction: Since its release by OpenAI in November 2022, ChatGPT has been tested in numerous studies evaluating its performance on medical examinations. This study evaluates ChatGPT's accuracy and logical reasoning across all 10 subjects in Stage 1 of the Senior Professional and Technical Examinations for Medical Doctors (SPTEMD) in Taiwan, with questions in both Chinese and English.

Methods: We tested ChatGPT-4 on SPTEMD Stage 1 multiple-choice questions drawn from three separate examinations held in February 2022, July 2022, and February 2023. The questions cover 10 subjects: biochemistry and molecular biology, anatomy, embryology and developmental biology, histology, physiology, microbiology and immunology, parasitology, pharmacology, pathology, and public health. We then analyzed the model's accuracy on each subject.

Results: ChatGPT exceeded the 60% passing threshold on all three examinations, with an overall average score of 87.8%. Its best performance was in biochemistry, with an average score of 93.8%. Its performance in anatomy, parasitology, and embryology was weaker, and its scores in embryology and parasitology varied widely.

Conclusion: ChatGPT may facilitate not only exam preparation but also the accessibility of medical education and continuing education for medical professionals. This study demonstrates ChatGPT's potential competence across the subjects of SPTEMD Stage 1 and suggests that it could be a helpful learning and exam-preparation tool for medical students and professionals.

Main Authors: Chao-Hsiung Huang, Han-Jung Hsiao, Pei-Chun Yeh, Kuo-Chen Wu, Chia-Hung Kao
Format: Article
Language: English
Published: SAGE Publishing, 2024-02-01
Series: Digital Health
Collection: DOAJ
Online Access: https://doi.org/10.1177/20552076241233144
ISSN: 2055-2076
Author affiliations:
- Chao-Hsiung Huang: School of Medicine, Taichung
- Han-Jung Hsiao: Artificial Intelligence Center, China Medical University, Taichung
- Pei-Chun Yeh: Artificial Intelligence Center, China Medical University, Taichung
- Kuo-Chen Wu: Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei
- Chia-Hung Kao: Department of Bioinformatics and Medical Engineering, Asia University, Taichung