Evaluation of medical decision support systems (DDX generators) using real medical cases of varying complexity and origin

Abstract Background Medical decision support systems (CDSSs) are increasingly used in medicine, but their utility in daily medical practice is difficult to evaluate. One variant of CDSS is a generator of differential diagnoses (DDx generator). We performed a feasibility study on three different, pub...

Bibliographic Details
Main Authors: P. Fritz, A. Kleinhans, R. Raoufi, A. Sediqi, N. Schmid, S. Schricker, M. Schanz, C. Fritz-Kuisle, P. Dalquen, H. Firooz, G. Stauch, M. D. Alscher
Format: Article
Language: English
Published: BMC 2022-09-01
Series: BMC Medical Informatics and Decision Making
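Subjects: Medical decision support systems (MDSS); Telemedicine; Second opinion; Diagnosis assistance systems; CDSS; DDx generator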
Online Access: https://doi.org/10.1186/s12911-022-01988-2
_version_ 1811208815518941184
author P. Fritz
A. Kleinhans
R. Raoufi
A. Sediqi
N. Schmid
S. Schricker
M. Schanz
C. Fritz-Kuisle
P. Dalquen
H. Firooz
G. Stauch
M. D. Alscher
author_sort P. Fritz
collection DOAJ
description Abstract
Background: Medical decision support systems (CDSSs) are increasingly used in medicine, but their utility in daily medical practice is difficult to evaluate. One variant of CDSS is a generator of differential diagnoses (DDx generator). We performed a feasibility study on three different, publicly available data sets of medical cases to determine how often two different DDx generators provide helpful information for a given case report, either by providing a list of differential diagnoses or by recognizing the expert diagnosis, if available.
Methods: The data sets comprised n = 105 cases from a web-based telemedicine forum with real-life cases from Afghanistan (Afghan data set; AD) and n = 124 cases discussed in a web-based medical forum (Coliquio data set; CD). Both websites are restricted to medical professionals. The third data set consisted of 50 special case reports published in the New England Journal of Medicine (NEJM). After keyword extraction, data were entered into two different DDx generators (IsabelHealth (IH), Memem7 (M7)) to examine differences in target diagnosis recognition and physician-rated usefulness between DDx generators.
Results: Both DDx generators detected the target diagnosis equally successfully (all cases: M7, 83/170 (49%); IH, 90/170 (53%); NEJM: M7, 28/50 (56%); IH, 34/50 (68%); differences not significant). Differences occurred in AD, where detection of an expert diagnosis was less successful with IH than with M7 (29.7% vs. 54.1%, p = 0.003). In contrast, in CD, IH performed significantly better than M7 (73.9% vs. 32.6%, p = 0.021). Congruent identification of the target diagnosis occurred in only 46/170 (27.1%) of cases. However, a qualitative analysis of the DDx results revealed that the two systems usefully complement each other when used in parallel.
Conclusion: Both DDx systems, IsabelHealth and Memem7, provided substantial help in generating a useful list of differential diagnoses or identifying the target diagnosis, in standard cases as well as in complicated and rare cases. Our pilot study highlights the need for real-world medical test cases of different types and levels of complexity, as there are significant differences between DDx generators outside traditional case reports. Combining results from different DDx generators seems to be a possible approach for future review and use of the systems.
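The percentages quoted in the Results can be rechecked from the reported counts. The following is a minimal sketch, not the authors' analysis: it only recomputes proportions, the dictionary labels and the helper name `detection_rate` are illustrative assumptions, and it does not attempt to reproduce the reported p-values, since the abstract does not state which statistical test was used.

```python
# Counts exactly as reported in the abstract: (target diagnoses found, cases evaluated).
# The labels are illustrative and not taken from the article.
reported = {
    "M7, all cases": (83, 170),
    "IH, all cases": (90, 170),
    "M7, NEJM": (28, 50),
    "IH, NEJM": (34, 50),
    "Both systems congruent, all cases": (46, 170),
}


def detection_rate(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


# Recompute every proportion from its raw counts.
rates = {label: detection_rate(hits, total) for label, (hits, total) in reported.items()}

for label, rate in rates.items():
    print(f"{label}: {rate}%")
```

To one decimal place, the recomputed values agree with the rounded percentages given in the abstract (for example, 83/170 is 48.8%, reported as 49%).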
first_indexed 2024-04-12T04:27:55Z
format Article
id doaj.art-9d4e9f15e3824bfe9d096d2ec08ef641
institution Directory Open Access Journal
issn 1472-6947
language English
last_indexed 2024-04-12T04:27:55Z
publishDate 2022-09-01
publisher BMC
record_format Article
series BMC Medical Informatics and Decision Making
spelling doaj.art-9d4e9f15e3824bfe9d096d2ec08ef641
2022-12-22T03:48:01Z
eng
BMC
BMC Medical Informatics and Decision Making
1472-6947
2022-09-01
22119
10.1186/s12911-022-01988-2
Evaluation of medical decision support systems (DDX generators) using real medical cases of varying complexity and origin
P. Fritz (0): Department of Pathology, Robert-Bosch-Hospital
A. Kleinhans (1): IPath Telemedicine Network Gemeinnützige GmbH
R. Raoufi (2): Abu Ali Sina Hospital
A. Sediqi (3): Abu Ali Sina Hospital
N. Schmid (4): Robert Bosch Gesellschaft Für Medizinische Forschung mbH
S. Schricker (5): Department of Internal Medicine and Nephrology, Robert-Bosch-Hospital
M. Schanz (6): Department of Internal Medicine and Nephrology, Robert-Bosch-Hospital
C. Fritz-Kuisle (7): Department of Anesthesia, Kreiskrankenhaus Günzburg
P. Dalquen (8): Institute of Pathology University Basel
H. Firooz (9): Firooz Medical Laboratory
G. Stauch (10): IPath Telemedicine Network Gemeinnützige GmbH
M. D. Alscher (11): Robert-Bosch-Hospital
https://doi.org/10.1186/s12911-022-01988-2
Medical decision support systems (MDSS)
Telemedicine
Second opinion
Diagnosis assistance systems
CDSS
DDx generator
title Evaluation of medical decision support systems (DDX generators) using real medical cases of varying complexity and origin
topic Medical decision support systems (MDSS)
Telemedicine
Second opinion
Diagnosis assistance systems
CDSS
DDx generator
url https://doi.org/10.1186/s12911-022-01988-2