Is ChatGPT a trusted source of information for total hip and knee arthroplasty patients?
Aims: While internet search engines have been the primary information source for patients’ questions, artificial intelligence large language models like ChatGPT are trending towards becoming the new primary source. The purpose of this study was to determine if ChatGPT can answer patient questions ab...
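Main Authors: | Benjamin M. Wright, Michael S. Bodnar, Andrew D. Moore, Meghan C. Maseda, Michael P. Kucharik, Connor C. Diaz, Christian M. Schmidt, Hassan R. Mir |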
---|---|
Format: | Article |
Language: | English |
Published: | The British Editorial Society of Bone & Joint Surgery, 2024-02-01 |
Series: | Bone & Joint Open |
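Subjects: | chatgpt, total hip arthroplasty, total knee arthroplasty, patient questions, accuracy, readability, total hip and knee arthroplasty, total knee arthroplasty (tka), knee arthroplasty, hip, resident physicians, bone tumours, physicians, paediatric orthopaedics, distal radius fractures, t-tests |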
Online Access: | https://online.boneandjoint.org.uk/doi/epdf/10.1302/2633-1462.52.BJO-2023-0113.R1 |
_version_ | 1797289594164609024 |
---|---|
author | Benjamin M. Wright, Michael S. Bodnar, Andrew D. Moore, Meghan C. Maseda, Michael P. Kucharik, Connor C. Diaz, Christian M. Schmidt, Hassan R. Mir |
author_sort | Benjamin M. Wright |
collection | DOAJ |
description | Aims: While internet search engines have been the primary information source for patients’ questions, artificial intelligence large language models like ChatGPT are trending towards becoming the new primary source. The purpose of this study was to determine if ChatGPT can answer patient questions about total hip (THA) and knee arthroplasty (TKA) with consistent accuracy, comprehensiveness, and easy readability. Methods: We posed the 20 most Google-searched questions about THA and TKA, plus ten additional postoperative questions, to ChatGPT. Each question was asked twice to evaluate for consistency in quality. Following each response, we responded with, “Please explain so it is easier to understand,” to evaluate ChatGPT’s ability to reduce response reading grade level, measured as Flesch-Kincaid Grade Level (FKGL). Five resident physicians rated the 120 responses on 1 to 5 accuracy and comprehensiveness scales. Additionally, they answered a “yes” or “no” question regarding acceptability. Mean scores were calculated for each question, and responses were deemed acceptable if ≥ four raters answered “yes.” Results: The mean accuracy and comprehensiveness scores were 4.26 (95% confidence interval (CI) 4.19 to 4.33) and 3.79 (95% CI 3.69 to 3.89), respectively. Out of all the responses, 59.2% (71/120; 95% CI 50.0% to 67.7%) were acceptable. ChatGPT was consistent when asked the same question twice, giving no significant difference in accuracy (t = 0.821; p = 0.415), comprehensiveness (t = 1.387; p = 0.171), acceptability (χ² = 1.832; p = 0.176), and FKGL (t = 0.264; p = 0.793). There was a significantly lower FKGL (t = 2.204; p = 0.029) for easier responses (11.14; 95% CI 10.57 to 11.71) than original responses (12.15; 95% CI 11.45 to 12.85). Conclusion: ChatGPT answered THA and TKA patient questions with accuracy comparable to previous reports of websites, with adequate comprehensiveness, but with limited acceptability as the sole information source. ChatGPT has potential for answering patient questions about THA and TKA, but needs improvement. Cite this article: Bone Jt Open 2024;5(2):139–146. |
first_indexed | 2024-03-07T19:07:23Z |
format | Article |
id | doaj.art-5d5d37289639425cbed29e730d8415d4 |
institution | Directory of Open Access Journals |
issn | 2633-1462 |
language | English |
last_indexed | 2024-03-07T19:07:23Z |
publishDate | 2024-02-01 |
publisher | The British Editorial Society of Bone & Joint Surgery |
record_format | Article |
series | Bone & Joint Open |
spelling | doaj.art-5d5d37289639425cbed29e730d8415d4; 2024-03-01T06:37:37Z; eng; The British Editorial Society of Bone & Joint Surgery; Bone & Joint Open; 2633-1462; 2024-02-01; 5(2):139–146; 10.1302/2633-1462.52.BJO-2023-0113.R1; Is ChatGPT a trusted source of information for total hip and knee arthroplasty patients?; Benjamin M. Wright [0] (https://orcid.org/0009-0000-8354-6540), Michael S. Bodnar [1], Andrew D. Moore [2], Meghan C. Maseda [3], Michael P. Kucharik [4], Connor C. Diaz [5], Christian M. Schmidt [6], Hassan R. Mir [7]; [0] Morsani College of Medicine, University of South Florida, Tampa, Florida, USA; [1] Morsani College of Medicine, University of South Florida, Tampa, Florida, USA; [2] Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA; [3] Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA; [4] Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA; [5] Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA; [6] Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA; [7] Orthopaedic Trauma Service, Florida Orthopedic Institute, Tampa, Florida, USA; https://online.boneandjoint.org.uk/doi/epdf/10.1302/2633-1462.52.BJO-2023-0113.R1; chatgpt, total hip arthroplasty, total knee arthroplasty, patient questions, accuracy, readability, total hip and knee arthroplasty, total knee arthroplasty (tka), knee arthroplasty, hip, resident physicians, bone tumours, physicians, paediatric orthopaedics, distal radius fractures, t-tests |
title | Is ChatGPT a trusted source of information for total hip and knee arthroplasty patients? |
topic | chatgpt, total hip arthroplasty, total knee arthroplasty, patient questions, accuracy, readability, total hip and knee arthroplasty, total knee arthroplasty (tka), knee arthroplasty, hip, resident physicians, bone tumours, physicians, paediatric orthopaedics, distal radius fractures, t-tests |
url | https://online.boneandjoint.org.uk/doi/epdf/10.1302/2633-1462.52.BJO-2023-0113.R1 |