Incorporating Synonym for Lexical Sememe Prediction: An Attention-Based Model

The sememe is the smallest semantic unit for describing real-world concepts, and its use improves the interpretability and performance of Natural Language Processing (NLP). To keep sememe descriptions accurate, the sememe knowledge base needs to be continuously updated, which is time-consuming and labo...

Bibliographic Details
Main Authors: Xiaojun Kang, Bing Li, Hong Yao, Qingzhong Liang, Shengwen Li, Junfang Gong, Xinchuan Li
Format: Article
Language: English
Published: MDPI AG 2020-08-01
Series: Applied Sciences
Subjects: natural language processing, knowledge base, commonsense, sememe prediction, attention model
Online Access: https://www.mdpi.com/2076-3417/10/17/5996
author Xiaojun Kang
Bing Li
Hong Yao
Qingzhong Liang
Shengwen Li
Junfang Gong
Xinchuan Li
collection DOAJ
description The sememe is the smallest semantic unit for describing real-world concepts, and its use improves the interpretability and performance of Natural Language Processing (NLP). To keep sememe descriptions accurate, the sememe knowledge base needs to be continuously updated, which is time-consuming and labor-intensive. Sememe prediction assigns sememes to unlabeled words and is therefore valuable for automatically building and/or updating sememe knowledge bases (KBs). Existing methods depend heavily on the quality of the word embedding vectors, so accurate sememe prediction remains a challenge. To address this problem, this study proposes a novel model that improves sememe prediction by introducing synonyms. The model scores candidate sememes from synonyms by combining the distances between words in the embedding vector space, and it derives an attention-based strategy to dynamically balance two kinds of knowledge: that from the synonym set and that from the word embedding vectors. A series of experiments is performed, and the results show that the proposed model significantly improves sememe prediction accuracy. The model provides a methodological reference for updating commonsense KBs and for embedding commonsense knowledge.
first_indexed 2024-03-10T16:43:20Z
format Article
id doaj.art-d52ebd0573ce432b89e9cfb5275f83ff
institution Directory of Open Access Journals
issn 2076-3417
language English
last_indexed 2024-03-10T16:43:20Z
publishDate 2020-08-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-d52ebd0573ce432b89e9cfb5275f83ff
2023-11-20T11:52:25Z
eng
MDPI AG
Applied Sciences
2076-3417
2020-08-01
10
17
5996
10.3390/app10175996
Incorporating Synonym for Lexical Sememe Prediction: An Attention-Based Model
Xiaojun Kang (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Bing Li (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Hong Yao (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Qingzhong Liang (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
Shengwen Li (School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China)
Junfang Gong (School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China)
Xinchuan Li (School of Computer Science, China University of Geosciences, Wuhan 430074, China)
The sememe is the smallest semantic unit for describing real-world concepts, and its use improves the interpretability and performance of Natural Language Processing (NLP). To keep sememe descriptions accurate, the sememe knowledge base needs to be continuously updated, which is time-consuming and labor-intensive. Sememe prediction assigns sememes to unlabeled words and is therefore valuable for automatically building and/or updating sememe knowledge bases (KBs). Existing methods depend heavily on the quality of the word embedding vectors, so accurate sememe prediction remains a challenge. To address this problem, this study proposes a novel model that improves sememe prediction by introducing synonyms. The model scores candidate sememes from synonyms by combining the distances between words in the embedding vector space, and it derives an attention-based strategy to dynamically balance two kinds of knowledge: that from the synonym set and that from the word embedding vectors. A series of experiments is performed, and the results show that the proposed model significantly improves sememe prediction accuracy. The model provides a methodological reference for updating commonsense KBs and for embedding commonsense knowledge.
https://www.mdpi.com/2076-3417/10/17/5996
natural language processing
knowledge base
commonsense
sememe prediction
attention model
title Incorporating Synonym for Lexical Sememe Prediction: An Attention-Based Model
topic natural language processing
knowledge base
commonsense
sememe prediction
attention model
url https://www.mdpi.com/2076-3417/10/17/5996
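
The description in this record says the model scores candidate sememes from synonyms using distances in the embedding space, then balances synonym knowledge against embedding knowledge with attention. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the toy two-dimensional vectors, the sememe labels, the use of cosine similarity as the distance, and the softmax over two hand-set logits standing in for the attention weights (the record does not say how the weights are computed).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbour_scores(target_vec, neighbours, embeddings, sememes):
    """Score each sememe by the summed similarity of the neighbour words that carry it."""
    scores = {}
    for word in neighbours:
        sim = cosine(target_vec, embeddings[word])
        for sememe in sememes[word]:
            scores[sememe] = scores.get(sememe, 0.0) + sim
    return scores


def attention_combine(syn_scores, emb_scores, syn_logit, emb_logit):
    """Blend two score tables with weights from a softmax over two logits.

    The logits are hypothetical inputs; a trained model would produce them.
    """
    top = max(syn_logit, emb_logit)  # subtract the max for numerical stability
    e_syn = math.exp(syn_logit - top)
    e_emb = math.exp(emb_logit - top)
    w_syn = e_syn / (e_syn + e_emb)
    w_emb = 1.0 - w_syn
    keys = set(syn_scores) | set(emb_scores)
    return {
        k: w_syn * syn_scores.get(k, 0.0) + w_emb * emb_scores.get(k, 0.0)
        for k in keys
    }


# Toy data (all hypothetical): labeled words, their vectors and sememes.
embeddings = {"glad": [1.0, 0.0], "cheerful": [0.8, 0.6], "table": [0.0, 1.0]}
sememes = {
    "glad": {"joyful"},
    "cheerful": {"joyful", "emotion"},
    "table": {"furniture"},
}
target_vec = [1.0, 0.0]          # vector of the unlabeled target word
synonyms = ["glad", "cheerful"]  # assumed synonym set of the target word

syn_scores = neighbour_scores(target_vec, synonyms, embeddings, sememes)
emb_scores = neighbour_scores(target_vec, list(embeddings), embeddings, sememes)
combined = attention_combine(syn_scores, emb_scores, syn_logit=0.0, emb_logit=0.0)
ranked = sorted(combined, key=combined.get, reverse=True)
```

With equal logits the two sources are simply averaged. Raising `syn_logit` relative to `emb_logit` shifts the ranking toward the synonym-derived scores, which is the balancing behaviour the description attributes to the attention strategy.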