Hierarchical shared transfer learning for biomedical named entity recognition

Abstract Background Biomedical named entity recognition (BioNER) is a basic and important medical information extraction task to extract medical entities with special meaning from medical texts. In recent years, deep learning has become the main research direction of BioNER due to its excellent data...

Bibliographic Details
Main Authors:	Zhaoying Chai, Han Jin, Shenghui Shi, Siyan Zhan, Lin Zhuo, Yu Yang
Format:	Article
Language:	English
Published:	BMC 2022-01-01
Series:	BMC Bioinformatics
Subjects:	BioNLP; Biomedical named entity recognition; Transfer learning; Permutation language model; Conditional random field
Online Access:	https://doi.org/10.1186/s12859-021-04551-4

_version_	1819229332513488896
author	Zhaoying Chai, Han Jin, Shenghui Shi, Siyan Zhan, Lin Zhuo, Yu Yang
author_sort	Zhaoying Chai
collection	DOAJ
description	Abstract. Background: Biomedical named entity recognition (BioNER) is a basic and important medical information extraction task that extracts medical entities with special meaning from medical texts. In recent years, deep learning has become the main research direction of BioNER due to its excellent data-driven context coding ability. However, in the BioNER task, deep learning suffers from poor generalization and instability. Results: We propose hierarchical shared transfer learning, which combines multi-task learning and fine-tuning and realizes multi-level information fusion between the underlying entity features and the upper data features. We select 14 datasets containing 4 types of entities for training and evaluate the model. The experimental results show that, compared with the single-task XLNet-CRF model, the F1-scores on the six gold standard datasets BC5CDR-chemical, BC5CDR-disease, BC2GM, BC4CHEMD, NCBI-disease and LINNAEUS changed by +0.57, +0.90, +0.42, +0.77, +0.98 and −2.16, respectively. BC5CDR-chemical, BC5CDR-disease and BC4CHEMD achieved state-of-the-art results. The reasons why LINNAEUS's multi-task results are lower than its single-task results are discussed at the dataset level. Conclusion: Compared with using multi-task learning or fine-tuning alone, the model recognizes medical entities more accurately and has higher generalization and stability.
first_indexed	2024-12-23T11:11:30Z
format	Article
id	doaj.art-5cc9d4ed5292411c940434f79b0cfc42
institution	Directory Open Access Journal
issn	1471-2105
language	English
last_indexed	2024-12-23T11:11:30Z
publishDate	2022-01-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-5cc9d4ed5292411c940434f79b0cfc42; 2022-12-21T17:49:20Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2022-01-01; 23; 1; 1; 14; 10.1186/s12859-021-04551-4; Hierarchical shared transfer learning for biomedical named entity recognition; Zhaoying Chai (College of Information Science and Technology, Beijing University of Chemical Technology); Han Jin (College of Information Science and Technology, Beijing University of Chemical Technology); Shenghui Shi (College of Information Science and Technology, Beijing University of Chemical Technology); Siyan Zhan (School of Public Health, Peking University); Lin Zhuo (Research Center of Clinical Epidemiology, Peking University Third Hospital); Yu Yang (National Institute of Health Data Science, Peking University); https://doi.org/10.1186/s12859-021-04551-4; BioNLP; Biomedical named entity recognition; Transfer learning; Permutation language model; Conditional random field
title	Hierarchical shared transfer learning for biomedical named entity recognition
topic	BioNLP; Biomedical named entity recognition; Transfer learning; Permutation language model; Conditional random field
url	https://doi.org/10.1186/s12859-021-04551-4

Similar Items