A self-attention based neural architecture for Chinese medical named entity recognition

Bibliographic Details
Main Authors: Qian Wan, Jie Liu, Luona Wei, Bin Ji
Format: Article
Language: English
Published: AIMS Press 2020-05-01
Series: Mathematical Biosciences and Engineering
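Subjects: self-attention; ELMo; named entity recognition; Chinese electronic medical records; natural language processing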
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2020197?viewType=HTML
_version_ 1818461625157419008
author Qian Wan
Jie Liu
Luona Wei
Bin Ji
collection DOAJ
description The combination of medicine and big data has led to explosive growth in the volume of electronic medical records (EMRs), which contain information that can guide diagnosis. How to extract this information from EMRs has become an active research topic. In this paper, we propose an approach based on an ELMo-ET-CRF model to extract medical named entities from Chinese electronic medical records (CEMRs). First, a domain-specific ELMo model is obtained by fine-tuning a general ELMo model on 4679 raw CEMRs. Then we use the encoder from the Transformer (ET) as our model's encoder to alleviate the long-context dependency problem, and a CRF is used as the decoder. Finally, we compare BiLSTM-CRF and ET-CRF models with word2vec and ELMo embeddings on CEMRs to validate the effectiveness of the ELMo-ET-CRF model. With the same training and test data, ELMo-ET-CRF outperforms all other model architectures considered in this paper, achieving an F1-score of 85.59%, which indicates the effectiveness of the proposed architecture; its performance is also competitive on the CCKS2019 leaderboard.
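The encoder/decoder split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy self-attention step with no learned projections (queries, keys and values are all the input vectors), leaves out the ELMo embeddings entirely, and uses a made-up three-tag scheme (0 = O, 1 = B, 2 = I) with invented scores. The function names `self_attention` and `viterbi` are hypothetical. It shows only the two ideas the abstract names: every position attends to every other position regardless of distance, and the CRF picks the best tag sequence as a whole rather than tag by tag.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def softmax(row: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_weights(x: Matrix) -> Matrix:
    """Scaled dot-product attention weights with Q = K = x (assumption: no learned projections)."""
    scale = math.sqrt(len(x[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in x])
        for q in x
    ]


def self_attention(x: Matrix) -> Matrix:
    """Each output vector is a weighted average of all input vectors (V = x)."""
    weights = self_attention_weights(x)
    dim = len(x[0])
    return [
        [sum(w * v[d] for w, v in zip(row, x)) for d in range(dim)]
        for row in weights
    ]


def viterbi(emissions: Matrix, transitions: Matrix) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]: score of tag j at position t (would come from the encoder).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for emit in emissions[1:]:
        step_score, step_back = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            step_back.append(best_i)
            step_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = step_score
        backpointers.append(step_back)
    # Trace back from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for step_back in reversed(backpointers):
        best = step_back[best]
        path.append(best)
    return path[::-1]


# Illustrative scores only: tags are 0 = O, 1 = B, 2 = I.
emissions = [[2.0, 0.5, 0.0], [0.0, 1.0, 1.2], [0.0, 0.0, 1.0]]
transitions = [[0.0, 0.0, -1e4], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # O -> I is penalised
path = viterbi(emissions, transitions)  # [0, 1, 2]
```

In this toy input the second token's emission scores slightly favour I, but because an I tag cannot follow O under the assumed transition scores, the decoder chooses B instead. That sequence-level constraint is the reason a CRF is commonly placed on top of an encoder for entity tagging.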
first_indexed 2024-12-14T23:49:07Z
format Article
id doaj.art-4b727f688c7c44329f418fd83d6a1644
institution Directory Open Access Journal
issn 1551-0018
language English
last_indexed 2024-12-14T23:49:07Z
publishDate 2020-05-01
publisher AIMS Press
record_format Article
series Mathematical Biosciences and Engineering
spelling doaj.art-4b727f688c7c44329f418fd83d6a1644
2022-12-21T22:43:17Z
eng
AIMS Press
Mathematical Biosciences and Engineering
1551-0018
2020-05-01
volume 17, issue 4, pages 3498-3511
10.3934/mbe.2020197
Qian Wan: 1. Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
Jie Liu: 1. Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China; 2. Laboratory of Software Engineering for Complex Systems, National University of Defense Technology, Changsha 410073, China
Luona Wei: 3. College of Computer, National University of Defense Technology, Changsha 410073, China
Bin Ji: 3. College of Computer, National University of Defense Technology, Changsha 410073, China
title A self-attention based neural architecture for Chinese medical named entity recognition
topic self-attention
ELMo
named entity recognition
Chinese electronic medical records
natural language processing
url https://www.aimspress.com/article/doi/10.3934/mbe.2020197?viewType=HTML