Is ChatGPT a trusted source of information for total hip and knee arthroplasty patients?

Bibliographic Details
Main Authors: Benjamin M. Wright, Michael S. Bodnar, Andrew D. Moore, Meghan C. Maseda, Michael P. Kucharik, Connor C. Diaz, Christian M. Schmidt, Hassan R. Mir
Format: Article
Language: English
Published: The British Editorial Society of Bone & Joint Surgery 2024-02-01
Series: Bone & Joint Open
Subjects:
Online Access: https://online.boneandjoint.org.uk/doi/epdf/10.1302/2633-1462.52.BJO-2023-0113.R1
author Benjamin M. Wright
Michael S. Bodnar
Andrew D. Moore
Meghan C. Maseda
Michael P. Kucharik
Connor C. Diaz
Christian M. Schmidt
Hassan R. Mir
collection DOAJ
description Aims: While internet search engines have been the primary information source for patients’ questions, artificial intelligence large language models like ChatGPT are trending towards becoming the new primary source. The purpose of this study was to determine if ChatGPT can answer patient questions about total hip (THA) and knee arthroplasty (TKA) with consistent accuracy, comprehensiveness, and easy readability.
Methods: We posed the 20 most Google-searched questions about THA and TKA, plus ten additional postoperative questions, to ChatGPT. Each question was asked twice to evaluate for consistency in quality. Following each response, we responded with, “Please explain so it is easier to understand,” to evaluate ChatGPT’s ability to reduce response reading grade level, measured as Flesch-Kincaid Grade Level (FKGL). Five resident physicians rated the 120 responses on 1 to 5 accuracy and comprehensiveness scales. Additionally, they answered a “yes” or “no” question regarding acceptability. Mean scores were calculated for each question, and responses were deemed acceptable if ≥ four raters answered “yes.”
Results: The mean accuracy and comprehensiveness scores were 4.26 (95% confidence interval (CI) 4.19 to 4.33) and 3.79 (95% CI 3.69 to 3.89), respectively. Out of all the responses, 59.2% (71/120; 95% CI 50.0% to 67.7%) were acceptable. ChatGPT was consistent when asked the same question twice, giving no significant difference in accuracy (t = 0.821; p = 0.415), comprehensiveness (t = 1.387; p = 0.171), acceptability (χ² = 1.832; p = 0.176), and FKGL (t = 0.264; p = 0.793). There was a significantly lower FKGL (t = 2.204; p = 0.029) for easier responses (11.14; 95% CI 10.57 to 11.71) than original responses (12.15; 95% CI 11.45 to 12.85).
Conclusion: ChatGPT answered THA and TKA patient questions with accuracy comparable to previous reports of websites, with adequate comprehensiveness, but with limited acceptability as the sole information source. ChatGPT has potential for answering patient questions about THA and TKA, but needs improvement.
Cite this article: Bone Jt Open 2024;5(2):139–146.
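The two scoring rules named in the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the FKGL function uses the standard published Flesch-Kincaid formula, but the syllable counter is a rough heuristic assumed here for the example, and the names `count_syllables`, `fkgl`, `is_acceptable`, and `TOTAL_RESPONSES` are hypothetical. The response count assumes that each of the 30 questions was asked twice and each asking produced an original and an "easier" response, which is consistent with the 120 responses reported.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption for this sketch): count vowel groups,
    after dropping a trailing silent 'e' (but keeping '-le' endings)."""
    word = word.lower()
    if word.endswith("e") and not word.endswith("le") and len(word) > 2:
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def is_acceptable(votes: list[bool], threshold: int = 4) -> bool:
    """A response is acceptable if at least `threshold` raters answered 'yes'."""
    return sum(votes) >= threshold


# 30 questions (20 + 10) x 2 askings x 2 versions (original + easier)
TOTAL_RESPONSES = (20 + 10) * 2 * 2
```

Dedicated readability tools count syllables more carefully, so grade levels from this heuristic will not necessarily reproduce the values reported in the abstract.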
first_indexed 2024-03-07T19:07:23Z
format Article
id doaj.art-5d5d37289639425cbed29e730d8415d4
institution Directory Open Access Journal
issn 2633-1462
language English
last_indexed 2024-03-07T19:07:23Z
publishDate 2024-02-01
publisher The British Editorial Society of Bone & Joint Surgery
record_format Article
series Bone & Joint Open
spelling doaj.art-5d5d37289639425cbed29e730d8415d4
2024-03-01T06:37:37Z
eng
The British Editorial Society of Bone & Joint Surgery
Bone & Joint Open
2633-1462
2024-02-01
Volume 5, issue 2, pages 139–146
10.1302/2633-1462.52.BJO-2023-0113.R1
Is ChatGPT a trusted source of information for total hip and knee arthroplasty patients?
Benjamin M. Wright (0), https://orcid.org/0009-0000-8354-6540, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
Michael S. Bodnar (1), Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
Andrew D. Moore (2), Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA
Meghan C. Maseda (3), Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA
Michael P. Kucharik (4), Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA
Connor C. Diaz (5), Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA
Christian M. Schmidt (6), Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA
Hassan R. Mir (7), Orthopaedic Trauma Service, Florida Orthopedic Institute, Tampa, Florida, USA
title Is ChatGPT a trusted source of information for total hip and knee arthroplasty patients?
topic chatgpt
total hip arthroplasty
total knee arthroplasty
patient questions
accuracy
readability
total hip and knee arthroplasty
total knee arthroplasty (tka)
knee arthroplasty
hip
resident physicians
bone tumours
physicians
paediatric orthopaedics
distal radius fractures
t-tests
url https://online.boneandjoint.org.uk/doi/epdf/10.1302/2633-1462.52.BJO-2023-0113.R1