Modified Bidirectional Encoder Representations From Transformers Extractive Summarization Model for Hospital Information Systems Based on Character-Level Tokens (AlphaBERT): Development and Performance Evaluation

Background: Doctors must care for many patients simultaneously, and it is time-consuming to find and examine all patients’ medical histories. Discharge diagnoses provide hospital staff with sufficient information to enable handling multiple patients; however, the excessive amount of words in the diagn...


Bibliographic Details
Main Authors: Chen, Yen-Pin, Chen, Yi-Ying, Lin, Jr-Jiun, Huang, Chien-Hua, Lai, Feipei
Format: Article
Language: English
Published: JMIR Publications 2020-04-01
Series: JMIR Medical Informatics
Online Access: http://medinform.jmir.org/2020/4/e17787/
_version_ 1819281825553448960
author Chen, Yen-Pin
Chen, Yi-Ying
Lin, Jr-Jiun
Huang, Chien-Hua
Lai, Feipei
collection DOAJ
description Background: Doctors must care for many patients simultaneously, and it is time-consuming to find and examine all patients’ medical histories. Discharge diagnoses provide hospital staff with sufficient information to enable handling multiple patients; however, the excessive amount of words in the diagnostic sentences poses problems. Deep learning may be an effective solution to overcome this problem, but the use of such a heavy model may also add another obstacle to systems with limited computing resources.
Objective: We aimed to build a diagnoses-extractive summarization model for hospital information systems and provide a service that can be operated even with limited computing resources.
Methods: We used a Bidirectional Encoder Representations from Transformers (BERT)-based structure with a two-stage training method based on 258,050 discharge diagnoses obtained from the National Taiwan University Hospital Integrated Medical Database, and the highlighted extractive summaries written by experienced doctors were labeled. The model size was reduced using a character-level token, the number of parameters was decreased from 108,523,714 to 963,496, and the model was pretrained using random mask characters in the discharge diagnoses and International Statistical Classification of Diseases and Related Health Problems sets. We then fine-tuned the model using summary labels and cleaned up the prediction results by averaging all probabilities for entire words to prevent character level–induced fragment words. Model performance was evaluated against existing models BERT, BioBERT, and Long Short-Term Memory (LSTM) using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) L score, and a questionnaire website was built to collect feedback from more doctors for each summary proposal.
Results: The area under the receiver operating characteristic curve values of the summary proposals were 0.928, 0.941, 0.899, and 0.947 for BERT, BioBERT, LSTM, and the proposed model (AlphaBERT), respectively. The ROUGE-L scores were 0.697, 0.711, 0.648, and 0.693 for BERT, BioBERT, LSTM, and AlphaBERT, respectively. The mean (SD) critique scores from doctors were 2.232 (0.832), 2.134 (0.877), 2.207 (0.844), 1.927 (0.910), and 2.126 (0.874) for reference-by-doctor labels, BERT, BioBERT, LSTM, and AlphaBERT, respectively. Based on the paired t test, there was a statistically significant difference in LSTM compared to the reference (P<.001), BERT (P=.001), BioBERT (P<.001), and AlphaBERT (P=.002), but not in the other models.
Conclusions: Use of character-level tokens in a BERT model can greatly decrease the model size without significantly reducing performance for diagnoses summarization. A well-developed deep-learning model will enhance doctors’ abilities to manage patients and promote medical studies by providing the capability to use extensive unstructured free-text notes.
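The word-level cleanup step described in the abstract (averaging the per-character probabilities over each entire word so that a summary never contains word fragments) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the whitespace-based definition of a word, and the 0.5 selection threshold are all assumptions made for illustration.

```python
import re


def average_word_probabilities(text, char_probs):
    """Give every character in a word the mean probability of that word.

    `char_probs[i]` is assumed to be a model's probability that character
    `text[i]` belongs to the summary. Words are assumed to be runs of
    non-whitespace characters; whitespace keeps its original probability.
    """
    if len(text) != len(char_probs):
        raise ValueError("text and char_probs must have the same length")
    smoothed = list(char_probs)
    for match in re.finditer(r"\S+", text):
        start, end = match.span()
        mean = sum(char_probs[start:end]) / (end - start)
        smoothed[start:end] = [mean] * (end - start)
    return smoothed


def extract_summary(text, char_probs, threshold=0.5):
    """Keep whole words whose averaged probability reaches the threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    smoothed = average_word_probabilities(text, char_probs)
    kept = [
        match.group()
        for match in re.finditer(r"\S+", text)
        if smoothed[match.start()] >= threshold
    ]
    return " ".join(kept)


if __name__ == "__main__":
    # Made-up probabilities: one low-scoring character inside "acute"
    # would split the word if characters were thresholded individually.
    text = "acute MI"
    probs = [0.9, 0.8, 0.2, 0.9, 0.7, 0.1, 0.2, 0.3]
    print(extract_summary(text, probs))
```

Because selection happens per word after averaging, a single low-scoring character cannot cut a word in half, which is the fragment problem the abstract attributes to character-level tokens.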
first_indexed 2024-12-24T01:05:51Z
format Article
id doaj.art-af8704ff7b5243f1ba93826689fdd920
institution Directory Open Access Journal
issn 2291-9694
language English
last_indexed 2024-12-24T01:05:51Z
publishDate 2020-04-01
publisher JMIR Publications
record_format Article
series JMIR Medical Informatics
spelling doaj.art-af8704ff7b5243f1ba93826689fdd920
2022-12-21T17:23:13Z
eng
JMIR Publications
JMIR Medical Informatics
2291-9694
2020-04-01
8
4
e17787
10.2196/17787
Modified Bidirectional Encoder Representations From Transformers Extractive Summarization Model for Hospital Information Systems Based on Character-Level Tokens (AlphaBERT): Development and Performance Evaluation
Chen, Yen-Pin
Chen, Yi-Ying
Lin, Jr-Jiun
Huang, Chien-Hua
Lai, Feipei
http://medinform.jmir.org/2020/4/e17787/
title Modified Bidirectional Encoder Representations From Transformers Extractive Summarization Model for Hospital Information Systems Based on Character-Level Tokens (AlphaBERT): Development and Performance Evaluation
url http://medinform.jmir.org/2020/4/e17787/