Analyzing Patient Secure Messages Using a Fast Health Care Interoperability Resources (FIHR)–Based Data Model: Development and Topic Modeling Study

Background: Patient portals tethered to electronic health records systems have become attractive web platforms since the enactment of the Medicare Access and Children’s Health Insurance Program Reauthorization Act and the introduction of the Meaningful Use program in the United...

Bibliographic Details
Main Authors: Amrita De, Ming Huang, Tinghao Feng, Xiaomeng Yue, Lixia Yao
Format: Article
Language: English
Published: JMIR Publications 2021-07-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2021/7/e26770
collection DOAJ
description Background: Patient portals tethered to electronic health records systems have become attractive web platforms since the enactment of the Medicare Access and Children’s Health Insurance Program Reauthorization Act and the introduction of the Meaningful Use program in the United States. Patients can conveniently access their health records and seek consultation from providers through secure web portals. With increasing adoption and patient engagement, the volume of patient secure messages has risen substantially, which opens up new research and development opportunities for patient-centered care.
Objective: This study aims to develop a data model for patient secure messages based on the Fast Healthcare Interoperability Resources (FHIR) standard to identify and extract significant information.
Methods: We drafted the first version of the data model by analyzing FHIR and manually reviewing 100 sentences randomly sampled from more than 2 million patient-generated secure messages obtained from the online patient portal at the Mayo Clinic Rochester between February 18, 2010, and December 31, 2017. We then annotated additional sets of 100 randomly selected sentences using the Multi-purpose Annotation Environment tool and updated the data model and annotation guideline iteratively until the interannotator agreement was satisfactory. We then created a larger corpus by annotating 1200 randomly selected sentences and calculated the frequency of the identified medical concepts in these sentences. Finally, we performed topic modeling analysis to learn the hidden topics of patient secure messages related to 3 highly mentioned microconcepts, namely, fatigue, prednisone, and patient visit, and to evaluate the proposed data model independently.
Results: The proposed data model has a 3-level hierarchical structure of health system concepts, including 3 macroconcepts, 28 mesoconcepts, and 85 microconcepts. Foundation and base macroconcepts comprise 33.99% (841/2474), clinical macroconcepts comprise 64.38% (1593/2474), and financial macroconcepts comprise 1.61% (40/2474) of the annotated corpus. The top 3 mesoconcepts among the 28 mesoconcepts are condition (505/2474, 20.41%), medication (424/2474, 17.13%), and practitioner (243/2474, 9.82%). Topic modeling identified hidden topics of patient secure messages related to fatigue, prednisone, and patient visit. A total of 89.2% (107/120) of the top-ranked topic keywords correspond to health concepts in the data model.
Conclusions: Our data model and annotated corpus enable us to identify and understand important medical concepts in patient secure messages and prepare us for further natural language processing analysis of such free texts. The data model could potentially be used to automatically identify other types of patient narratives, such as those in various social media and patient forums. In the future, we plan to develop a machine learning and natural language processing solution that enables automatic triaging to reduce the workload of clinicians, and to perform more granular content analysis to understand patients’ needs and improve patient-centered care.
first_indexed 2024-03-12T13:04:05Z
id doaj.art-1b5081bc543142a28eb67fabe593d269
institution Directory Open Access Journal
issn 1438-8871
last_indexed 2024-03-12T13:04:05Z
spelling doaj.art-1b5081bc543142a28eb67fabe593d269
2023-08-28T17:10:59Z
eng
JMIR Publications
Journal of Medical Internet Research
1438-8871
2021-07-01
23
7
e26770
10.2196/26770
Analyzing Patient Secure Messages Using a Fast Health Care Interoperability Resources (FIHR)–Based Data Model: Development and Topic Modeling Study
Amrita De https://orcid.org/0000-0002-5704-574X
Ming Huang https://orcid.org/0000-0001-7367-3626
Tinghao Feng https://orcid.org/0000-0003-2765-2765
Xiaomeng Yue https://orcid.org/0000-0002-4418-7079
Lixia Yao https://orcid.org/0000-0002-5187-6120
https://www.jmir.org/2021/7/e26770