Stacking-BERT model for Chinese medical procedure entity normalization

Medical procedure entity normalization is an important task for realizing medical information sharing at the semantic level; in real-world practice it faces major challenges such as the variety and similarity of procedure terms. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings, and there is minimal research on medical entity recognition in Chinese. Treating the entity normalization task as a sentence-pair classification task, we applied a three-step framework to normalize Chinese medical procedure terms, consisting of dataset construction, candidate concept generation and candidate concept ranking. For dataset construction, an external knowledge base and easy data augmentation techniques were used to increase the diversity of training samples. For candidate concept generation, we implemented the BM25 retrieval method based on synonym knowledge integrated from SNOMED CT and the training data. For candidate concept ranking, we designed a stacking-BERT model, including the original BERT-based and Siamese-BERT ranking models, to capture the semantic information and select the optimal mapping pairs via a stacking mechanism. During training, we also applied adversarial training to improve the model's learning ability on small-scale training data. On the clinical entity normalization task dataset of the 5th China Health Information Processing Conference, our stacking-BERT model achieved an accuracy of 93.1%, outperforming the single BERT models and other traditional deep learning models. In conclusion, this paper presents an effective method for Chinese medical procedure entity normalization and a validation of different BERT-based models. In addition, we found that adversarial training and data augmentation can effectively improve the performance of deep learning models on small samples, which might provide useful ideas for future research.

Bibliographic Details
Main Authors: Luqi Li, Yunkai Zhai, Jinghong Gao, Linlin Wang, Li Hou, Jie Zhao
Author Affiliations: 1. Institute of Medical Information, Chinese Academy of Medical Sciences / Peking Union Medical College, Beijing, China; 2. National Engineering Laboratory for Internet Medical Systems and Applications, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
Format: Article
Language: English
Published: AIMS Press, 2023-01-01
Series: Mathematical Biosciences and Engineering, Vol. 20, No. 1, pp. 1018-1036
ISSN: 1551-0018
DOI: 10.3934/mbe.2023047
Subjects: Chinese medical procedure entity normalization; BERT; Siamese-BERT; stacking; adversarial training
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2023047?viewType=HTML
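
The abstract describes a BM25 retrieval step that narrows a large concept dictionary down to a handful of candidates per mention. The snippet below is a minimal sketch of that idea using the third-party rank_bm25 package; the toy concept list, the character-level tokenization and the top-k value are illustrative assumptions, not the paper's actual dictionary or settings.

# Candidate concept generation sketch: BM25 retrieval over a concept dictionary.
from rank_bm25 import BM25Okapi  # pip install rank-bm25

# Toy standard-concept dictionary; the paper builds this from SNOMED CT synonyms
# merged with the training data.
concepts = ["经皮冠状动脉支架置入术", "冠状动脉旁路移植术", "胃大部切除术"]

# Character-level tokens are a simple choice for matching Chinese surface forms.
tokenized_concepts = [list(c) for c in concepts]
bm25 = BM25Okapi(tokenized_concepts)

def generate_candidates(mention, k=2):
    """Return the top-k candidate standard concepts for a raw procedure mention."""
    return bm25.get_top_n(list(mention), concepts, n=k)

print(generate_candidates("冠脉支架术"))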
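
Candidate ranking is then framed as sentence-pair classification: a mention and one candidate concept are encoded together and scored as match or non-match. The sketch below shows this formulation with the Hugging Face transformers library; the bert-base-chinese checkpoint is an untrained stand-in for the fine-tuned ranker described in the paper, and the example strings are made up.

import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Generic Chinese BERT checkpoint as a stand-in for the paper's fine-tuned pair ranker.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)
model.eval()

def match_score(mention, concept):
    """Probability that the candidate concept is the correct normalization of the mention."""
    enc = tokenizer(mention, concept, return_tensors="pt", truncation=True, max_length=64)
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

# Rank the retrieved candidates by the pair-classification score and keep the best one.
mention = "冠脉支架术"
candidates = ["经皮冠状动脉支架置入术", "冠状动脉旁路移植术"]
print(max(candidates, key=lambda c: match_score(mention, c)))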
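
The Siamese-BERT ranker instead encodes the mention and each candidate independently and compares the resulting vectors, which is cheaper at inference time than scoring every pair jointly. Below is a hedged sketch using the sentence-transformers library; the multilingual checkpoint is a placeholder, not the model trained in the paper.

from sentence_transformers import SentenceTransformer, util

# Multilingual sentence encoder as a placeholder for the paper's fine-tuned Siamese-BERT.
encoder = SentenceTransformer("distiluse-base-multilingual-cased-v1")

mention = "冠脉支架术"
candidates = ["经皮冠状动脉支架置入术", "冠状动脉旁路移植术"]

# Encode each string once, then compare with cosine similarity.
mention_vec = encoder.encode(mention, convert_to_tensor=True)
candidate_vecs = encoder.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(mention_vec, candidate_vecs)[0]
print(candidates[int(scores.argmax())], float(scores.max()))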
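
The "stacking" in stacking-BERT refers to combining the base rankers' scores with a meta-model rather than trusting either ranker alone. The toy example below illustrates the mechanism with a logistic-regression meta-classifier over two base scores; the feature values and the choice of meta-model are assumptions for illustration only.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row holds the scores two base rankers assigned to one mention-concept pair,
# e.g., [BERT pair-classifier score, Siamese-BERT similarity]. Values are made up.
X_meta = np.array([
    [0.91, 0.88],
    [0.42, 0.55],
    [0.77, 0.31],
    [0.08, 0.15],
])
y_meta = np.array([1, 0, 1, 0])  # 1 = correct mapping pair, 0 = incorrect

# The meta-classifier learns how to weight the base rankers' opinions.
meta = LogisticRegression().fit(X_meta, y_meta)
print(meta.predict_proba([[0.80, 0.70]])[:, 1])  # combined confidence for a new pair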
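
The abstract mentions adversarial training as a trick for small training sets without naming a specific method; a common choice for BERT fine-tuning is FGM, which perturbs the word embeddings along the gradient and adds a second backward pass. The sketch below assumes that variant and a Hugging Face-style model exposing get_input_embeddings(); it is an assumption, not necessarily the exact scheme used in the paper.

import torch

class FGM:
    """Fast Gradient Method: adversarial perturbation on the word-embedding matrix."""

    def __init__(self, model, epsilon=1.0):
        self.model = model
        self.epsilon = epsilon  # assumed perturbation size, not the paper's value
        self.backup = None

    def attack(self):
        emb = self.model.get_input_embeddings().weight
        self.backup = emb.data.clone()
        if emb.grad is not None:
            norm = torch.norm(emb.grad)
            if norm != 0 and not torch.isnan(norm):
                emb.data.add_(self.epsilon * emb.grad / norm)

    def restore(self):
        emb = self.model.get_input_embeddings().weight
        emb.data = self.backup

# Typical use inside one training step (model, batch, loss_fn, optimizer are placeholders):
#   loss = loss_fn(model(**batch).logits, batch["labels"])
#   loss.backward()                  # gradients on the clean input
#   fgm = FGM(model); fgm.attack()   # perturb embeddings along the gradient
#   loss_fn(model(**batch).logits, batch["labels"]).backward()  # adversarial gradients
#   fgm.restore()                    # undo the perturbation before the update
#   optimizer.step(); optimizer.zero_grad()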