Unlabeled Short Text Similarity With LSTM Encoder
Short texts play an important role in our daily communication and are used in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. Our preprocessing...
Main Authors: | Lin Yao, Zhengyu Pan, Huansheng Ning |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Unlabeled; short text; similarity measurement; LSTM encoder |
Online Access: | https://ieeexplore.ieee.org/document/8570751/ |
_version_ | 1819179305733718016 |
---|---|
author | Lin Yao; Zhengyu Pan; Huansheng Ning |
collection | DOAJ |
description | Short texts play an important role in our daily communication and are used in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. Our preprocessing algorithm normalizes the input so that vanishing-gradient problems are avoided and backward propagation proceeds faster. The training stage uses an inception module to extract features of different dimensions and a modified LSTM network to model the relationships within word sequences. The evaluating stage employs cosine distance to calculate the semantic similarity of two short texts. We run experiments on two short text datasets of different lengths and analyze the results. The results show that our algorithm makes full use of the semantic and sequence information of short texts and achieves higher accuracy and recall than other short text similarity measurement algorithms. |
first_indexed | 2024-12-22T21:56:20Z |
format | Article |
id | doaj.art-9cc563158c9c4237b1eea27ea555a389 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T21:56:20Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-9cc563158c9c4237b1eea27ea555a389; 2022-12-21T18:11:15Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7; pp. 3430-3437; DOI 10.1109/ACCESS.2018.2885698; document 8570751; Unlabeled Short Text Similarity With LSTM Encoder; Lin Yao; Zhengyu Pan (https://orcid.org/0000-0001-5841-5688); Huansheng Ning (https://orcid.org/0000-0001-9437-2718); all authors: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; https://ieeexplore.ieee.org/document/8570751/; Unlabeled; short text; similarity measurement; LSTM encoder |
title | Unlabeled Short Text Similarity With LSTM Encoder |
topic | Unlabeled; short text; similarity measurement; LSTM encoder |
url | https://ieeexplore.ieee.org/document/8570751/ |
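
The description in this record states that the evaluating stage employs cosine distance to compare two short texts. As a rough illustration of that single step only, the following minimal Python sketch computes cosine similarity and cosine distance between two fixed-length vectors. The function names, the zero-vector convention, and the example vectors are assumptions made for illustration; they are not taken from the paper, and the LSTM encoder that would produce such vectors is not shown.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat an all-zero vector as having no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance, defined here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


if __name__ == "__main__":
    # Hypothetical encoder outputs for three short texts (made-up numbers).
    text_a = [0.2, 0.7, 0.1]
    text_b = [0.4, 1.4, 0.2]   # same direction as text_a, different length
    text_c = [0.7, -0.2, 0.0]  # orthogonal to text_a
    print("a vs b:", cosine_similarity(text_a, text_b))
    print("a vs c:", cosine_similarity(text_a, text_c))
```

Because cosine similarity depends only on the direction of the vectors, two encodings that differ only in magnitude score as maximally similar, which is one common reason for preferring it over Euclidean distance when comparing text embeddings.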