A dataset of simulated patient-physician medical interviews with a focus on respiratory cases
Abstract Artificial Intelligence (AI) is playing a major role in medical education, diagnosis, and outbreak detection through Natural Language Processing (NLP), machine learning models and deep learning tools. However, in order to train AI to facilitate these medical fields, well-documented and accu...
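Main Authors: | Faiha Fareez, Tishya Parikh, Christopher Wavell, Saba Shahab, Meghan Chevalier, Scott Good, Isabella De Blasi, Rafik Rhouma, Christopher McMahon, Jean-Paul Lam, Thomas Lo, Christopher W. Smith |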
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2022-06-01 |
Series: | Scientific Data |
Online Access: | https://doi.org/10.1038/s41597-022-01423-1 |
_version_ | 1797818206070505472 |
---|---|
author | Faiha Fareez Tishya Parikh Christopher Wavell Saba Shahab Meghan Chevalier Scott Good Isabella De Blasi Rafik Rhouma Christopher McMahon Jean-Paul Lam Thomas Lo Christopher W. Smith |
author_facet | Faiha Fareez Tishya Parikh Christopher Wavell Saba Shahab Meghan Chevalier Scott Good Isabella De Blasi Rafik Rhouma Christopher McMahon Jean-Paul Lam Thomas Lo Christopher W. Smith |
author_sort | Faiha Fareez |
collection | DOAJ |
description | Abstract Artificial Intelligence (AI) is playing a major role in medical education, diagnosis, and outbreak detection through Natural Language Processing (NLP), machine learning models and deep learning tools. However, in order to train AI to facilitate these medical fields, well-documented and accurate medical conversations are needed. The dataset presented covers a series of medical conversations in the format of Objective Structured Clinical Examinations (OSCE), with a focus on respiratory cases in audio format and corresponding text documents. These cases were simulated, recorded, transcribed, and manually corrected with the underlying aim of providing a comprehensive set of medical conversation data to the academic and industry community. Potential applications include speech recognition detection for speech-to-text errors, training NLP models to extract symptoms, detecting diseases, or for educational purposes, including training an avatar to converse with healthcare professional students as a standardized patient during clinical examinations. The application opportunities for the presented dataset are vast, given that this calibre of data is difficult to access and costly to develop. |
first_indexed | 2024-03-13T09:04:43Z |
format | Article |
id | doaj.art-56d168ddbcbc47e183a66e82e71fa6c5 |
institution | Directory Open Access Journal |
issn | 2052-4463 |
language | English |
last_indexed | 2024-03-13T09:04:43Z |
publishDate | 2022-06-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Data |
spelling | doaj.art-56d168ddbcbc47e183a66e82e71fa6c5 2023-05-28T11:08:17Z eng Nature Portfolio Scientific Data 2052-4463 2022-06-01 9 1 1 7 10.1038/s41597-022-01423-1 A dataset of simulated patient-physician medical interviews with a focus on respiratory cases Faiha Fareez 0 Tishya Parikh 1 Christopher Wavell 2 Saba Shahab 3 Meghan Chevalier 4 Scott Good 5 Isabella De Blasi 6 Rafik Rhouma 7 Christopher McMahon 8 Jean-Paul Lam 9 Thomas Lo 10 Christopher W. Smith 11 Western University Western University Western University Western University Western University Western University Western University Goodlabs Studio Goodlabs Studio Goodlabs Studio Goodlabs Studio Western University Abstract Artificial Intelligence (AI) is playing a major role in medical education, diagnosis, and outbreak detection through Natural Language Processing (NLP), machine learning models and deep learning tools. However, in order to train AI to facilitate these medical fields, well-documented and accurate medical conversations are needed. The dataset presented covers a series of medical conversations in the format of Objective Structured Clinical Examinations (OSCE), with a focus on respiratory cases in audio format and corresponding text documents. These cases were simulated, recorded, transcribed, and manually corrected with the underlying aim of providing a comprehensive set of medical conversation data to the academic and industry community. Potential applications include speech recognition detection for speech-to-text errors, training NLP models to extract symptoms, detecting diseases, or for educational purposes, including training an avatar to converse with healthcare professional students as a standardized patient during clinical examinations. The application opportunities for the presented dataset are vast, given that this calibre of data is difficult to access and costly to develop. https://doi.org/10.1038/s41597-022-01423-1 |
spellingShingle | Faiha Fareez Tishya Parikh Christopher Wavell Saba Shahab Meghan Chevalier Scott Good Isabella De Blasi Rafik Rhouma Christopher McMahon Jean-Paul Lam Thomas Lo Christopher W. Smith A dataset of simulated patient-physician medical interviews with a focus on respiratory cases Scientific Data |
title | A dataset of simulated patient-physician medical interviews with a focus on respiratory cases |
title_full | A dataset of simulated patient-physician medical interviews with a focus on respiratory cases |
title_fullStr | A dataset of simulated patient-physician medical interviews with a focus on respiratory cases |
title_full_unstemmed | A dataset of simulated patient-physician medical interviews with a focus on respiratory cases |
title_short | A dataset of simulated patient-physician medical interviews with a focus on respiratory cases |
title_sort | dataset of simulated patient physician medical interviews with a focus on respiratory cases |
url | https://doi.org/10.1038/s41597-022-01423-1 |