TWE‐WSD: An effective topical word embedding based word sense disambiguation

Bibliographic Details
Main Authors: Lianyin Jia, Jilin Tang, Mengjuan Li, Jinguo You, Jiaman Ding, Yinong Chen
Format: Article
Language: English
Published: Wiley 2021-03-01
Series: CAAI Transactions on Intelligence Technology
Online Access: https://doi.org/10.1049/cit2.12006
Description
Summary: Word embedding has been widely used in word sense disambiguation (WSD) and many other tasks in recent years because it represents the semantics of words well. However, existing word embedding methods mostly represent each word as a single vector, without considering the homonymy and polysemy of the word; thus, their performance is limited. To address this problem, an effective topical word embedding (TWE)‐based WSD method, named TWE‐WSD, is proposed, which integrates Latent Dirichlet Allocation (LDA) and word embedding. Instead of generating a single word vector (WV) for each word, TWE‐WSD generates a topical WV for each word under each topic. Effective integration strategies are designed to obtain high‐quality contextual vectors. Extensive experiments on the SemEval‐2013 and SemEval‐2015 English all‐words tasks showed that TWE‐WSD outperforms other state‐of‐the‐art WSD methods, especially on nouns.
ISSN: 2468-2322