Prompt-Based Word-Level Information Injection BERT for Chinese Named Entity Recognition


Bibliographic Details
Main Authors: Qiang He, Guowei Chen, Wenchao Song, Pengzhou Zhang
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Applied Sciences
Subjects: named entity recognition; prompt learning; pre-trained language model; BERT adapter
Online Access: https://www.mdpi.com/2076-3417/13/5/3331
author Qiang He
Guowei Chen
Wenchao Song
Pengzhou Zhang
collection DOAJ
description Named entity recognition (NER) is a subfield of natural language processing (NLP) that identifies and classifies entities in plain text, such as people, organizations, and locations. NER is a fundamental task in information extraction, information retrieval, and text summarization, as it helps organize relevant information in a structured way. Current approaches to Chinese NER do not consider the category information of matched Chinese words, which limits their ability to capture correlations between words. Chinese NER is also more challenging than English NER, because Chinese text lacks the explicit word boundaries that English provides. Improving Chinese NER therefore calls for approaches that take the category features of matched Chinese words into account; this category information helps capture the relationships between words effectively. This paper proposes a Prompt-based Word-level Information Injection BERT (PWII-BERT) that integrates prompt-guided lexicon information into a pre-trained language model. Specifically, we engineer a Word-level Information Injection Adapter (WIIA) from the original Transformer encoder and prompt-guided Transformer layers. A key advantage of PWII-BERT is its ability to explicitly obtain fine-grained character-to-word information according to the category prompt. In experiments on four benchmark datasets, PWII-BERT outperforms the baselines, demonstrating the value of fusing category information and lexicon features for Chinese NER.
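The record gives only a high-level description of the architecture. As an illustration of the general idea the abstract describes (each character attends to the lexicon words that match it, with the words' category embeddings fused in, and the result is injected back into the character representation), here is a minimal NumPy sketch. All names, dimensions, the random embeddings, and the additive fusion scheme are assumptions for illustration, not the paper's actual WIIA implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding size (illustrative)

# Hypothetical inputs: a 4-character sentence and 2 matched lexicon words.
char_emb = rng.normal(size=(4, d))  # character representations
word_emb = rng.normal(size=(2, d))  # matched-word representations
cat_emb = rng.normal(size=(2, d))   # category-prompt embeddings (e.g. PER, LOC)

# Mask: character i may only attend to lexicon words that cover its position.
mask = np.array([[1, 0],
                 [1, 1],
                 [0, 1],
                 [0, 1]], dtype=bool)

def inject(char_emb, word_emb, cat_emb, mask):
    """Add category-aware word information to each character.

    Attention scores are char · (word + category); a masked softmax keeps
    only the words that actually match the character's position, and the
    weighted sum is added back residually.
    """
    keys = word_emb + cat_emb                # fuse lexicon and category info
    scores = char_emb @ keys.T / np.sqrt(d)  # (4, 2) character-to-word scores
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return char_emb + weights @ keys         # residual injection

out = inject(char_emb, word_emb, cat_emb, mask)
print(out.shape)  # → (4, 8)
```

When a character matches only one word (rows 0, 2, and 3 of the mask), the masked softmax puts all weight on that word, so the output is exactly the character embedding plus that fused word-category vector.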
format Article
id doaj.art-246111b6c3754d0aacfe48a4ec3bde3c
institution Directory Open Access Journal
issn 2076-3417
language English
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Applied Sciences
doi 10.3390/app13053331
citation Applied Sciences, vol. 13, iss. 5, art. 3331 (2023-03-01), MDPI AG, ISSN 2076-3417
affiliation State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100803, China (all four authors)
title Prompt-Based Word-Level Information Injection BERT for Chinese Named Entity Recognition
topic named entity recognition
prompt learning
pre-trained language model
BERT adapter
url https://www.mdpi.com/2076-3417/13/5/3331