Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition
Named entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with the attention mechanism have yielded excellent performance in NER by taking advantage of character-level and word-level representation learning.
Main Authors: | Wazir Ali, Jay Kumar, Zenglin Xu, Rajesh Kumar, Yazhou Ren |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-09-01 |
Series: | Applied Sciences |
Subjects: | Sindhi named entity recognition; recurrent neural networks; CaBiLSTM; self-attention mechanism; contextual representation learning |
Online Access: | https://www.mdpi.com/2076-3417/11/19/9038 |
_version_ | 1827680934726467584 |
---|---|
author | Wazir Ali Jay Kumar Zenglin Xu Rajesh Kumar Yazhou Ren |
author_facet | Wazir Ali Jay Kumar Zenglin Xu Rajesh Kumar Yazhou Ren |
author_sort | Wazir Ali |
collection | DOAJ |
description | Named entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with the attention mechanism have yielded excellent performance in NER by taking advantage of character-level and word-level representation learning. In this paper, we propose a deep context-aware bidirectional long short-term memory (CaBiLSTM) model for the Sindhi NER task. The model relies upon contextual representation learning (CRL), a bidirectional encoder, self-attention, and a sequential conditional random field (CRF). The CaBiLSTM model incorporates task-oriented CRL based on joint character-level and word-level representations. It takes character-level input to learn the character representations. Afterwards, the character representations are transformed into word features, and the bidirectional encoder learns the word representations. The output of the final encoder is fed into the self-attention through a hidden layer before decoding. Finally, we employ the CRF for the prediction of label sequences. The baselines and the proposed CaBiLSTM model are compared by exploiting pretrained Sindhi GloVe (SdGloVe), Sindhi fastText (SdfastText), task-oriented, and CRL-based word representations on the recently proposed SiNER dataset. Our proposed CaBiLSTM model achieved a high F1-score of 91.25% on the SiNER dataset with CRL without relying on additional handmade features, such as hand-crafted rules, gazetteers, or dictionaries. |
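The decoding stage described in the abstract (a sequential CRF predicting the label sequence from the encoder's scores) can be illustrated with a minimal Viterbi decoder. This is a generic sketch of linear-chain CRF decoding, not the authors' implementation; the label set, emission scores, and transition scores below are invented for illustration.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions:   list of per-token score dicts {label: score}
    transitions: dict {(prev_label, label): score}
    """
    labels = list(emissions[0])
    # best[l] = score of the best path ending in label l at the current token
    best = {l: emissions[0][l] for l in labels}
    back = []  # backpointers, one dict per token after the first
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for l in labels:
            # choose the best previous label to transition from
            prev, score = max(
                ((p, best[p] + transitions.get((p, l), 0.0)) for p in labels),
                key=lambda x: x[1],
            )
            new_best[l] = score + em[l]
            ptr[l] = prev
        best, back = new_best, back + [ptr]
    # trace the best path backwards from the highest-scoring final label
    last = max(best, key=best.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical two-token example with two labels:
ems = [
    {"B-PER": 2.0, "O": 0.5},
    {"B-PER": 0.2, "O": 1.5},
]
trans = {("B-PER", "O"): 0.3, ("O", "B-PER"): 0.0,
         ("B-PER", "B-PER"): -1.0, ("O", "O"): 0.2}
print(viterbi_decode(ems, trans))  # → ['B-PER', 'O']
```

In the paper's pipeline these emission scores would come from the self-attention layer's hidden states, and the transition scores would be learned CRF parameters; here both are toy values.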
first_indexed | 2024-03-10T07:06:19Z |
format | Article |
id | doaj.art-00e6f5d30bcc45c8a92819b1c01680a9 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T07:06:19Z |
publishDate | 2021-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-00e6f5d30bcc45c8a92819b1c01680a92023-11-22T15:46:45ZengMDPI AGApplied Sciences2076-34172021-09-011119903810.3390/app11199038Context-Aware Bidirectional Neural Model for Sindhi Named Entity RecognitionWazir Ali0Jay Kumar1Zenglin Xu2Rajesh Kumar3Yazhou Ren4School of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611713, ChinaSchool of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611713, ChinaSchool of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611713, ChinaSchool of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611713, ChinaSchool of Computer Science and Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu 611713, ChinaNamed entity recognition (NER) is a fundamental task in many natural language processing (NLP) applications, such as text summarization and semantic information retrieval. Recently, deep neural networks (NNs) with the attention mechanism have yielded excellent performance in NER by taking advantage of character-level and word-level representation learning. In this paper, we propose a deep context-aware bidirectional long short-term memory (CaBiLSTM) model for the Sindhi NER task. The model relies upon contextual representation learning (CRL), bidirectional encoder, self-attention, and sequential conditional random field (CRF). The CaBiLSTM model incorporates task-oriented CRL based on joint character-level and word-level representations. It takes character-level input to learn the character representations. Afterwards, the character representations are transformed into word features, and the bidirectional encoder learns the word representations. The output of the final encoder is fed into the self-attention through a hidden layer before decoding. Finally, we employ the CRF for the prediction of label sequences. The baselines and the proposed CaBiLSTM model are compared by exploiting pretrained Sindhi GloVe (SdGloVe), Sindhi fastText (SdfastText), task-oriented, and CRL-based word representations on the recently proposed SiNER dataset. Our proposed CaBiLSTM model achieved a high F1-score of 91.25% on the SiNER dataset with CRL without relying on additional handmade features, such as hand-crafted rules, gazetteers, or dictionaries.https://www.mdpi.com/2076-3417/11/19/9038Sindhi named entity recognitionrecurrent neural networksCaBiLSTMself-attention mechanismcontextual representation learning |
spellingShingle | Wazir Ali Jay Kumar Zenglin Xu Rajesh Kumar Yazhou Ren Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition Applied Sciences Sindhi named entity recognition recurrent neural networks CaBiLSTM self-attention mechanism contextual representation learning |
title | Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition |
title_full | Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition |
title_fullStr | Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition |
title_full_unstemmed | Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition |
title_short | Context-Aware Bidirectional Neural Model for Sindhi Named Entity Recognition |
title_sort | context aware bidirectional neural model for sindhi named entity recognition |
topic | Sindhi named entity recognition recurrent neural networks CaBiLSTM self-attention mechanism contextual representation learning |
url | https://www.mdpi.com/2076-3417/11/19/9038 |
work_keys_str_mv | AT wazirali contextawarebidirectionalneuralmodelforsindhinamedentityrecognition AT jaykumar contextawarebidirectionalneuralmodelforsindhinamedentityrecognition AT zenglinxu contextawarebidirectionalneuralmodelforsindhinamedentityrecognition AT rajeshkumar contextawarebidirectionalneuralmodelforsindhinamedentityrecognition AT yazhouren contextawarebidirectionalneuralmodelforsindhinamedentityrecognition |