Research on Chinese Nested Entity Recognition Based on IDCNNLR and GlobalPointer

The task of named entity recognition (NER) is to identify entities in the text and predict their categories. In real-life scenarios, the context of the text is often complex, and there may exist nested entities within an entity. This kind of entity is called a nested entity, and the task of recogniz...

Full description

Bibliographic Details
Main Authors: Weijun Li, Jintong Liu, Yuxiao Gao, Xinyong Zhang, Jianlai Gu
Format: Article
Language:English
Published: MDPI AG 2024-01-01
Series:Applied System Innovation
Subjects:
Online Access:https://www.mdpi.com/2571-5577/7/1/8
Description
Summary:The task of named entity recognition (NER) is to identify entities in the text and predict their categories. In real-life scenarios, the context of the text is often complex, and there may exist nested entities within an entity. This kind of entity is called a nested entity, and the task of recognizing entities with nested structures is referred to as nested named entity recognition. Most existing NER models can only handle flat entities, and there has been limited research progress in Chinese nested named entity recognition, resulting in relatively few models in this direction. General NER models have limited semantic extraction capabilities and cannot capture deep semantic information between nested entities in the text. To address these issues, this paper proposes a model that uses the GlobalPointer module to identify nested entities in the text and constructs the IDCNNLR semantic extraction module to extract deep semantic information. Furthermore, multiple-head self-attention mechanisms are incorporated into the model at multiple positions to achieve data denoising, enhancing the quality of semantic features. The proposed model considers each possible entity boundary through the GlobalPointer module, and the IDCNNLR semantic extraction module and multi-position attention mechanism are introduced to enhance the model’s semantic extraction capability. Experimental results demonstrate that the proposed model achieves <i>F1</i> scores of 69.617% and 79.285% on the CMeEE Chinese nested entity recognition dataset and CLUENER2020 Chinese fine-grained entity recognition dataset, respectively. The model exhibits improvement compared to baseline models, and each innovation point shows effective performance enhancement in ablative experiments.
ISSN:2571-5577