Incorporating Synonym for Lexical Sememe Prediction: An Attention-Based Model
A sememe is the smallest semantic unit for describing real-world concepts, and its use improves the interpretability and performance of Natural Language Processing (NLP). To keep sememe descriptions accurate, the sememe knowledge base must be continuously updated, which is time-consuming and labo...
Main Authors: | Xiaojun Kang, Bing Li, Hong Yao, Qingzhong Liang, Shengwen Li, Junfang Gong, Xinchuan Li |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-08-01 |
Series: | Applied Sciences |
Subjects: | natural language processing, knowledge base, commonsense, sememe prediction, attention model |
Online Access: | https://www.mdpi.com/2076-3417/10/17/5996 |
_version_ | 1797555146738106368 |
---|---|
author | Xiaojun Kang Bing Li Hong Yao Qingzhong Liang Shengwen Li Junfang Gong Xinchuan Li |
author_facet | Xiaojun Kang Bing Li Hong Yao Qingzhong Liang Shengwen Li Junfang Gong Xinchuan Li |
author_sort | Xiaojun Kang |
collection | DOAJ |
description | A sememe is the smallest semantic unit for describing real-world concepts, and its use improves the interpretability and performance of Natural Language Processing (NLP). To keep sememe descriptions accurate, the sememe knowledge base must be continuously updated, which is time-consuming and labor-intensive. Sememe prediction assigns sememes to unlabeled words and is valuable for automatically building and/or updating sememe knowledge bases (KBs). Existing methods depend heavily on the quality of the word embedding vectors, so accurate sememe prediction remains a challenge. To address this problem, this study proposes a novel model that improves sememe prediction by introducing synonyms. The model scores candidate sememes from synonyms by combining distances between words in the embedding vector space, and derives an attention-based strategy to dynamically balance two kinds of knowledge: the synonym set and the word embedding vector. A series of experiments shows that the proposed model significantly improves sememe prediction accuracy. The model provides a methodological reference for updating commonsense KBs and embedding commonsense knowledge. |
first_indexed | 2024-03-10T16:43:20Z |
format | Article |
id | doaj.art-d52ebd0573ce432b89e9cfb5275f83ff |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T16:43:20Z |
publishDate | 2020-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-d52ebd0573ce432b89e9cfb5275f83ff; 2023-11-20T11:52:25Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-08-01; 10; 17; 5996; 10.3390/app10175996; Incorporating Synonym for Lexical Sememe Prediction: An Attention-Based Model; Xiaojun Kang (School of Computer Science, China University of Geosciences, Wuhan 430074, China); Bing Li (School of Computer Science, China University of Geosciences, Wuhan 430074, China); Hong Yao (School of Computer Science, China University of Geosciences, Wuhan 430074, China); Qingzhong Liang (School of Computer Science, China University of Geosciences, Wuhan 430074, China); Shengwen Li (School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China); Junfang Gong (School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China); Xinchuan Li (School of Computer Science, China University of Geosciences, Wuhan 430074, China); https://www.mdpi.com/2076-3417/10/17/5996; natural language processing; knowledge base; commonsense; sememe prediction; attention model |
title | Incorporating Synonym for Lexical Sememe Prediction: An Attention-Based Model |
topic | natural language processing knowledge base commonsense sememe prediction attention model |
url | https://www.mdpi.com/2076-3417/10/17/5996 |
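
The description above says the model scores candidate sememes from synonyms using distances in the embedding space and balances synonym knowledge against embedding knowledge with an attention-based weight. The record gives no formulas, so the following is only a minimal illustrative sketch of that general idea, not the authors' model. Every name here (`cosine`, `score_from_words`, `predict_sememes`, the sigmoid-based weight `alpha`) is an assumption made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_from_words(target_vec, words, embeddings, sememes_of):
    """Let each labeled word vote for its sememes, weighted by its
    (non-negative) similarity to the target; normalize the votes to sum to 1."""
    scores = {}
    for w in words:
        sim = max(cosine(target_vec, embeddings[w]), 0.0)
        for s in sememes_of.get(w, ()):
            scores[s] = scores.get(s, 0.0) + sim
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()} if total else {}


def predict_sememes(target, synonyms, embeddings, sememes_of, top_k=3):
    """Rank candidate sememes for `target` by mixing two score sources."""
    target_vec = embeddings[target]
    labeled = [w for w in sememes_of if w != target and w in embeddings]
    syn = [w for w in synonyms if w in labeled]

    syn_scores = score_from_words(target_vec, syn, embeddings, sememes_of)
    emb_scores = score_from_words(target_vec, labeled, embeddings, sememes_of)

    # Hypothetical stand-in for the attention weight: trust the synonym set
    # more when its members lie close to the target in embedding space.
    if syn:
        closeness = sum(cosine(target_vec, embeddings[w]) for w in syn) / len(syn)
        alpha = 1.0 / (1.0 + math.exp(-closeness))
    else:
        alpha = 0.0

    combined = {
        s: alpha * syn_scores.get(s, 0.0) + (1.0 - alpha) * emb_scores.get(s, 0.0)
        for s in set(syn_scores) | set(emb_scores)
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

In this sketch a word with no usable synonyms falls back entirely on the embedding-based votes (`alpha = 0.0`), which mirrors the stated motivation of not relying on a single knowledge source. A trained attention mechanism would learn the weight rather than compute it from a fixed sigmoid.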