Prompt-Based Word-Level Information Injection BERT for Chinese Named Entity Recognition
Named entity recognition (NER) is a subfield of natural language processing (NLP) that identifies and classifies entities from plain text, such as people, organizations, locations, and other types. NER is a fundamental task in information extraction, information retrieval, and text summarization, as it helps to organize the relevant information in a structured way.
Main Authors: | Qiang He, Guowei Chen, Wenchao Song, Pengzhou Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-03-01 |
Series: | Applied Sciences |
Subjects: | named entity recognition; prompt learning; pre-trained language model; BERT adapter |
Online Access: | https://www.mdpi.com/2076-3417/13/5/3331 |
author | Qiang He; Guowei Chen; Wenchao Song; Pengzhou Zhang |
collection | DOAJ |
description | Named entity recognition (NER) is a subfield of natural language processing (NLP) that identifies and classifies entities in plain text, such as people, organizations, locations, and other types. NER is a fundamental task in information extraction, information retrieval, and text summarization, as it helps to organize the relevant information in a structured way. Chinese NER is more challenging than English NER because Chinese text lacks explicit word boundaries, and current approaches do not consider the category information of matched Chinese words, which limits their ability to capture correlations between words. New approaches that take the category features of matched Chinese words into account are therefore needed, since category information helps capture the relationships between words effectively. This paper proposes Prompt-based Word-level Information Injection BERT (PWII-BERT) to integrate prompt-guided lexicon information into a pre-trained language model. Specifically, we engineer a Word-level Information Injection Adapter (WIIA) from the original Transformer encoder and prompt-guided Transformer layers. A key advantage of PWII-BERT is its ability to explicitly obtain fine-grained character-to-word relevance information according to the category prompt. In experiments on four benchmark datasets, PWII-BERT outperforms the baselines, demonstrating the benefit of fusing category information with lexicon features for Chinese NER. |
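The word-level injection step described in the abstract, where characters attend over lexicon words conditioned on a category prompt, can be sketched at a high level as a single cross-attention pass with a residual (adapter-style) fusion. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the function name `word_info_injection`, the additive prompt conditioning, and the single-head attention form are all assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def word_info_injection(char_states, word_embs, prompt_emb):
    """Fuse matched-word (lexicon) features into character states.

    char_states: (num_chars, d) character hidden states from the encoder
    word_embs:   (num_words, d) embeddings of lexicon words matched in the sentence
    prompt_emb:  (d,)           embedding of a category prompt (e.g. "person")
    """
    # Condition each matched word on the category prompt.
    prompted = word_embs + prompt_emb
    # Characters attend over prompted word features (scaled dot-product).
    scores = char_states @ prompted.T / np.sqrt(char_states.shape[-1])
    attn = softmax(scores, axis=-1)   # fine-grained char-to-word relevance
    injected = attn @ prompted        # word-level information per character
    # Residual fusion, adapter-style: original states plus injected word info.
    return char_states + injected
```

The residual form keeps the adapter non-destructive: the original character representations pass through unchanged, with word-level evidence added on top, which is the usual motivation for injecting external features via an adapter rather than retraining the encoder.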
format | Article |
id | doaj.art-246111b6c3754d0aacfe48a4ec3bde3c |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
publishDate | 2023-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
doi | 10.3390/app13053331 |
citation | Applied Sciences, vol. 13, no. 5, art. 3331 (2023-03-01) |
affiliation | State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100803, China (all four authors) |
title | Prompt-Based Word-Level Information Injection BERT for Chinese Named Entity Recognition |
topic | named entity recognition prompt learning pre-trained language model BERT adapter |
url | https://www.mdpi.com/2076-3417/13/5/3331 |