A Lexical Resource-Constrained Topic Model for Word Relatedness
Word relatedness computation is an important supporting technology for many tasks in natural language processing. Traditionally, there have been two distinct strategies for word relatedness measurement: one utilizes corpus-based models, whereas the other leverages external lexical resources. However...
Main Authors: | Yongjing Yin, Jiali Zeng, Hongji Wang, Keqing Wu, Bin Luo, Jinsong Su |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Natural language processing; unsupervised learning |
Online Access: | https://ieeexplore.ieee.org/document/8703742/ |
_version_ | 1818649103545925632 |
---|---|
author | Yongjing Yin Jiali Zeng Hongji Wang Keqing Wu Bin Luo Jinsong Su |
author_facet | Yongjing Yin Jiali Zeng Hongji Wang Keqing Wu Bin Luo Jinsong Su |
author_sort | Yongjing Yin |
collection | DOAJ |
description | Word relatedness computation is an important supporting technology for many tasks in natural language processing. Traditionally, there have been two distinct strategies for word relatedness measurement: one utilizes corpus-based models, whereas the other leverages external lexical resources. However, each solution has its own strengths and weaknesses. In this paper, we propose a lexical resource-constrained topic model to integrate the two complementary strategies effectively. Our model is an extension of probabilistic latent semantic analysis, which automatically learns word-level distributed representations for word relatedness measurement. Furthermore, we introduce the generalized expectation maximization (GEM) algorithm for statistical estimation. The proposed model not only inherits the advantage of conventional topic models in dimension reduction, but it also refines parameter estimation by using word pairs that are known to be related. The experimental results in different languages demonstrate the effectiveness of our model in topic extraction and word relatedness measurement. |
first_indexed | 2024-12-17T01:29:00Z |
format | Article |
id | doaj.art-a6b50515027141f2b10cfd759827db40 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-17T01:29:00Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-a6b50515027141f2b10cfd759827db40 2022-12-21T22:08:37Z eng IEEE IEEE Access 2169-3536 2019-01-01 7 55261 55268 10.1109/ACCESS.2019.2909104 8703742 A Lexical Resource-Constrained Topic Model for Word Relatedness Yongjing Yin (https://orcid.org/0000-0003-1138-4612), Jiali Zeng, Hongji Wang, Keqing Wu, Bin Luo, Jinsong Su (Software School, Xiamen University, Xiamen, China) Word relatedness computation is an important supporting technology for many tasks in natural language processing. Traditionally, there have been two distinct strategies for word relatedness measurement: one utilizes corpus-based models, whereas the other leverages external lexical resources. However, each solution has its own strengths and weaknesses. In this paper, we propose a lexical resource-constrained topic model to integrate the two complementary strategies effectively. Our model is an extension of probabilistic latent semantic analysis, which automatically learns word-level distributed representations for word relatedness measurement. Furthermore, we introduce the generalized expectation maximization (GEM) algorithm for statistical estimation. The proposed model not only inherits the advantage of conventional topic models in dimension reduction, but it also refines parameter estimation by using word pairs that are known to be related. The experimental results in different languages demonstrate the effectiveness of our model in topic extraction and word relatedness measurement. https://ieeexplore.ieee.org/document/8703742/ Natural language processing; unsupervised learning |
spellingShingle | Yongjing Yin Jiali Zeng Hongji Wang Keqing Wu Bin Luo Jinsong Su A Lexical Resource-Constrained Topic Model for Word Relatedness IEEE Access Natural language processing unsupervised learning |
title | A Lexical Resource-Constrained Topic Model for Word Relatedness |
title_full | A Lexical Resource-Constrained Topic Model for Word Relatedness |
title_fullStr | A Lexical Resource-Constrained Topic Model for Word Relatedness |
title_full_unstemmed | A Lexical Resource-Constrained Topic Model for Word Relatedness |
title_short | A Lexical Resource-Constrained Topic Model for Word Relatedness |
title_sort | lexical resource constrained topic model for word relatedness |
topic | Natural language processing unsupervised learning |
url | https://ieeexplore.ieee.org/document/8703742/ |
work_keys_str_mv | AT yongjingyin alexicalresourceconstrainedtopicmodelforwordrelatedness AT jializeng alexicalresourceconstrainedtopicmodelforwordrelatedness AT hongjiwang alexicalresourceconstrainedtopicmodelforwordrelatedness AT keqingwu alexicalresourceconstrainedtopicmodelforwordrelatedness AT binluo alexicalresourceconstrainedtopicmodelforwordrelatedness AT jinsongsu alexicalresourceconstrainedtopicmodelforwordrelatedness AT yongjingyin lexicalresourceconstrainedtopicmodelforwordrelatedness AT jializeng lexicalresourceconstrainedtopicmodelforwordrelatedness AT hongjiwang lexicalresourceconstrainedtopicmodelforwordrelatedness AT keqingwu lexicalresourceconstrainedtopicmodelforwordrelatedness AT binluo lexicalresourceconstrainedtopicmodelforwordrelatedness AT jinsongsu lexicalresourceconstrainedtopicmodelforwordrelatedness |
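The abstract in the record above outlines the approach at a high level: a PLSA-style topic model whose parameter estimation is additionally constrained by word pairs known to be related (via a generalized EM procedure), with word relatedness then measured over the learned word-topic representations. The snippet below is a minimal, hypothetical sketch of that idea, not the authors' implementation: the constraint step (the `lam` interpolation over `related_pairs`), the function names, and the use of cosine similarity over column-normalized topic profiles are assumptions made only to produce a runnable illustration.

```python
import numpy as np

def constrained_plsa(counts, related_pairs, n_topics=20, n_iter=50, lam=0.1, seed=0):
    """Illustrative PLSA-style EM with a lexical-constraint step (not the paper's exact GEM update).

    counts: (n_docs, n_words) term-frequency matrix.
    related_pairs: list of (i, j) word-index pairs taken from a lexical resource.
    lam: hypothetical interpolation weight controlling how strongly related
         words are pulled toward a shared topic profile.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # P(z|d) and P(w|z), randomly initialized and normalized.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z | d, w) for every document-word cell.
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # (docs, topics, words)
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post                    # (docs, topics, words)
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        # Constraint step (a stand-in for the generalized M-step): interpolate the
        # topic columns of known-related word pairs toward their average, then
        # renormalize each topic so it remains a distribution over words.
        for i, j in related_pairs:
            avg = 0.5 * (p_w_z[:, i] + p_w_z[:, j])
            p_w_z[:, i] = (1.0 - lam) * p_w_z[:, i] + lam * avg
            p_w_z[:, j] = (1.0 - lam) * p_w_z[:, j] + lam * avg
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12

    return p_z_d, p_w_z

def word_relatedness(p_w_z, i, j):
    """Relatedness as cosine similarity between column-normalized topic profiles P(z|w)."""
    profiles = p_w_z / (p_w_z.sum(axis=0, keepdims=True) + 1e-12)
    a, b = profiles[:, i], profiles[:, j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy usage: a tiny random corpus with one word pair marked as related.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    toy_counts = rng.integers(0, 5, size=(30, 200)).astype(float)
    _, p_w_z = constrained_plsa(toy_counts, related_pairs=[(3, 7)], n_topics=10, n_iter=30)
    print(word_relatedness(p_w_z, 3, 7))
```

In this sketch the lexical resource enters only through the interpolation weight `lam`; the paper's actual GEM derivation and constraint formulation should be taken from the article at https://ieeexplore.ieee.org/document/8703742/.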