Unlabeled Short Text Similarity With LSTM Encoder

Short texts play an important role in daily communication and are used in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. Our preprocessing...

Bibliographic Details
Main Authors: Lin Yao, Zhengyu Pan, Huansheng Ning
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Unlabeled, short text, similarity measurement, LSTM encoder
Online Access: https://ieeexplore.ieee.org/document/8570751/
author Lin Yao
Zhengyu Pan
Huansheng Ning
collection DOAJ
description Short texts play an important role in daily communication and are used in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. Our preprocessing algorithm normalizes the input, which helps avoid vanishing-gradient problems and speeds up backward propagation. The training stage leverages the inception module to extract features of different dimensions and improves the LSTM network to process the relationships within word sequences. The evaluating stage employs cosine distance to calculate the semantic similarity of two short texts. We conduct experiments on two short text datasets of different lengths and analyze the results. The results show that our algorithm can fully employ the semantic and sequence information of short texts and achieves higher accuracy and recall than other short text similarity measurement algorithms.
first_indexed 2024-12-22T21:56:20Z
format Article
id doaj.art-9cc563158c9c4237b1eea27ea555a389
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T21:56:20Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-9cc563158c9c4237b1eea27ea555a389 2022-12-21T18:11:15Z eng
IEEE, IEEE Access, ISSN 2169-3536, 2019-01-01, vol. 7, pp. 3430-3437
DOI 10.1109/ACCESS.2018.2885698, IEEE document 8570751
Unlabeled Short Text Similarity With LSTM Encoder
Lin Yao, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Zhengyu Pan (https://orcid.org/0000-0001-5841-5688), School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Huansheng Ning (https://orcid.org/0000-0001-9437-2718), School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
https://ieeexplore.ieee.org/document/8570751/
Keywords: Unlabeled, short text, similarity measurement, LSTM encoder
title Unlabeled Short Text Similarity With LSTM Encoder
topic Unlabeled
short text
similarity measurement
LSTM encoder
url https://ieeexplore.ieee.org/document/8570751/
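
The description above says the evaluating stage uses cosine distance to score the semantic similarity of two short texts. As a minimal sketch of that single step only, the Python below compares two fixed-length vectors. The function names and the example vectors are hypothetical stand-ins for encoder outputs; the record does not give the authors' implementation, encoder architecture, or any decision threshold.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector has zero length, which is an
    assumption made here to avoid division by zero.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance, defined here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical encodings of two short texts (illustrative values only).
text_a = [0.2, 0.7, 0.1]
text_b = [0.4, 1.4, 0.2]  # same direction as text_a, so distance is near 0
print(cosine_distance(text_a, text_b))
```

A smaller distance means the two encodings point in more similar directions. How the encodings are produced is left entirely to the LSTM encoder described in the paper.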