MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models
CIKM ’24, October 21–25, 2024, Boise, ID, USA
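Main Authors: | Thao, Phan Nguyen Minh; Dao, Cong-Tinh; Wu, Chenwei; Wang, Jian-Zhe; Liu, Shun; Ding, Jun-En; Restrepo, David; Liu, Feng; Hung, Fang-Ming; Peng, Wen-Chih |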
---|---|
Other Authors: | Massachusetts Institute of Technology. Institute for Medical Engineering & Science |
Format: | Article |
Language: | English |
Published: | ACM|Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024 |
Online Access: | https://hdl.handle.net/1721.1/157546 |
_version_ | 1824457905192042496 |
---|---|
author | Thao, Phan Nguyen Minh; Dao, Cong-Tinh; Wu, Chenwei; Wang, Jian-Zhe; Liu, Shun; Ding, Jun-En; Restrepo, David; Liu, Feng; Hung, Fang-Ming; Peng, Wen-Chih |
author2 | Massachusetts Institute of Technology. Institute for Medical Engineering & Science |
author_facet | Massachusetts Institute of Technology. Institute for Medical Engineering & Science; Thao, Phan Nguyen Minh; Dao, Cong-Tinh; Wu, Chenwei; Wang, Jian-Zhe; Liu, Shun; Ding, Jun-En; Restrepo, David; Liu, Feng; Hung, Fang-Ming; Peng, Wen-Chih |
author_sort | Thao, Phan Nguyen Minh |
collection | MIT |
description | CIKM ’24, October 21–25, 2024, Boise, ID, USA |
first_indexed | 2025-02-19T04:17:25Z |
format | Article |
id | mit-1721.1/157546 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2025-02-19T04:17:25Z |
publishDate | 2024 |
publisher | ACM|Proceedings of the 33rd ACM International Conference on Information and Knowledge Management |
record_format | dspace |
spelling | mit-1721.1/157546 2025-02-13T19:45:44Z MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models Thao, Phan Nguyen Minh; Dao, Cong-Tinh; Wu, Chenwei; Wang, Jian-Zhe; Liu, Shun; Ding, Jun-En; Restrepo, David; Liu, Feng; Hung, Fang-Ming; Peng, Wen-Chih Massachusetts Institute of Technology. Institute for Medical Engineering & Science CIKM ’24, October 21–25, 2024, Boise, ID, USA Electronic health records (EHRs) are multimodal by nature, consisting of structured tabular features such as lab tests and unstructured clinical notes. In real-life clinical practice, doctors use complementary multimodal EHR data sources to get a clearer picture of patients' health and to support clinical decision-making. However, most EHR predictive models do not reflect this practice, as they either focus on a single modality or overlook inter-modality interactions and redundancy. In this work, we propose MEDFuse, a Multimodal EHR Data Fusion framework that incorporates masked lab-test modeling and large language models (LLMs) to effectively integrate structured and unstructured medical data. MEDFuse leverages multimodal embeddings extracted from two sources: LLMs fine-tuned on free clinical text and masked tabular transformers trained on structured lab test results. We design a disentangled transformer module, optimized by a mutual information loss, to 1) decouple modality-specific and modality-shared information and 2) extract a useful joint representation from the noise and redundancy present in clinical notes. Through comprehensive validation on the public MIMIC-III dataset and the in-house FEMH dataset, MEDFuse demonstrates great potential in advancing clinical predictions, achieving an F1 score above 90% in the 10-disease multi-label classification task. 
2024-11-14T21:33:51Z; 2024-11-14T21:33:51Z; 2024-10-21; 2024-11-01T07:46:35Z; Article; http://purl.org/eprint/type/ConferencePaper; 979-8-4007-0436-9; https://hdl.handle.net/1721.1/157546; Thao, Phan Nguyen Minh, Dao, Cong-Tinh, Wu, Chenwei, Wang, Jian-Zhe, Liu, Shun et al. 2024. "MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models."; PUBLISHER_POLICY; en; https://doi.org/10.1145/3627673.3679962; Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.; The author(s); application/pdf; ACM|Proceedings of the 33rd ACM International Conference on Information and Knowledge Management; Association for Computing Machinery |
title | MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models |
url | https://hdl.handle.net/1721.1/157546 |