STSG: A Short Text Semantic Graph Model for Similarity Computing Based on Dependency Parsing and Pre-trained Language Models
ABSTRACT: Short text semantic similarity is a crucial research area in natural language processing; it is used to predict the similarity between two sentences. Because short texts are sparse, words are treated in isolation and the correlations between words are ignored, so it is very diffic...
Main Authors: | Hai Liao, Yan Liang, Song Chen, Lingyun Xiang, Zhimin Chang, Yun Xiao |
---|---|
Format: | Article |
Language: | English |
Published: | Taylor & Francis Group, 2024-12-01 |
Series: | Applied Artificial Intelligence |
Online Access: | https://www.tandfonline.com/doi/10.1080/08839514.2024.2321552 |
_version_ | 1797279036994486272 |
---|---|
author | Hai Liao Yan Liang Song Chen Lingyun Xiang Zhimin Chang Yun Xiao |
author_facet | Hai Liao Yan Liang Song Chen Lingyun Xiang Zhimin Chang Yun Xiao |
author_sort | Hai Liao |
collection | DOAJ |
description | ABSTRACT: Short text semantic similarity is a crucial research area in natural language processing; it is used to predict the similarity between two sentences. Because short texts are sparse, words are treated in isolation and the correlations between words are ignored, which makes it very difficult to compute global semantic information. To address this, this paper proposes the short text semantic graph (STSG) model, based on dependency parsing and pre-trained language models. It uses syntactic information to obtain word dependency relationships and incorporates them into pre-trained language models to enhance the global semantic information of sentences, and so addresses semantic sparsity more effectively. A text semantic graph layer based on the graph attention network (GAT) is also implemented, which treats word vectors as node features and word dependencies as edge features. The attention mechanism of GAT can identify the importance of different word correlations and model word dependencies effectively. On the challenging short text semantic benchmark dataset MRPC, the STSG model achieves an F1-score of 0.946, an improvement of 2.16% over previous SOTA approaches. At the time of writing, STSG has achieved a new SOTA performance on the MRPC dataset. |
first_indexed | 2024-03-07T16:17:38Z |
format | Article |
id | doaj.art-2e0b2d0c0e6a4cfdbf704bbc290efa2c |
institution | Directory of Open Access Journals |
issn | 0883-9514 1087-6545 |
language | English |
last_indexed | 2024-03-07T16:17:38Z |
publishDate | 2024-12-01 |
publisher | Taylor & Francis Group |
record_format | Article |
series | Applied Artificial Intelligence |
spelling | doaj.art-2e0b2d0c0e6a4cfdbf704bbc290efa2c 2024-03-04T09:05:44Z eng Taylor & Francis Group Applied Artificial Intelligence 0883-9514 1087-6545 2024-12-01 38 1 10.1080/08839514.2024.2321552 STSG: A Short Text Semantic Graph Model for Similarity Computing Based on Dependency Parsing and Pre-trained Language Models Hai Liao (0), Yan Liang (1), Song Chen (2), Lingyun Xiang (3), Zhimin Chang (4), Yun Xiao (5) (0) School of Computer Engineering, Chengdu Technological University, Chengdu, China; (1) School of Computer Engineering, Chengdu Technological University, Chengdu, China; (2) School of Computer Engineering, Chengdu Technological University, Chengdu, China; (3) Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary; (4) The Research and Development Department, HAN Networks Corporation Limited, Beijing, China; (5) School of Software, Sichuan Vocational College of Information Technology, Guangyuan, China ABSTRACT: Short text semantic similarity is a crucial research area in natural language processing; it is used to predict the similarity between two sentences. Because short texts are sparse, words are treated in isolation and the correlations between words are ignored, which makes it very difficult to compute global semantic information. To address this, this paper proposes the short text semantic graph (STSG) model, based on dependency parsing and pre-trained language models. It uses syntactic information to obtain word dependency relationships and incorporates them into pre-trained language models to enhance the global semantic information of sentences, and so addresses semantic sparsity more effectively. A text semantic graph layer based on the graph attention network (GAT) is also implemented, which treats word vectors as node features and word dependencies as edge features. The attention mechanism of GAT can identify the importance of different word correlations and model word dependencies effectively. On the challenging short text semantic benchmark dataset MRPC, the STSG model achieves an F1-score of 0.946, an improvement of 2.16% over previous SOTA approaches. At the time of writing, STSG has achieved a new SOTA performance on the MRPC dataset. https://www.tandfonline.com/doi/10.1080/08839514.2024.2321552 |
spellingShingle | Hai Liao Yan Liang Song Chen Lingyun Xiang Zhimin Chang Yun Xiao STSG: A Short Text Semantic Graph Model for Similarity Computing Based on Dependency Parsing and Pre-trained Language Models Applied Artificial Intelligence |
title | STSG: A Short Text Semantic Graph Model for Similarity Computing Based on Dependency Parsing and Pre-trained Language Models |
title_sort | stsg a short text semantic graph model for similarity computing based on dependency parsing and pre trained language models |
url | https://www.tandfonline.com/doi/10.1080/08839514.2024.2321552 |
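
The abstract describes a graph layer in which word vectors are node features, dependency relations are edge features, and GAT attention weights the word correlations. The record does not give the model's equations, so the following is only a minimal sketch of one common edge-aware GAT formulation, not the authors' implementation. Everything in it is an assumption made for illustration: the example sentence, the hand-written dependency arcs, the random vectors standing in for pre-trained word embeddings, the random weights `W` and `a`, and the choice to treat arcs as undirected with a self-loop per node.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_layer(node_feats, edges, W, a, edge_dim):
    """Single-head graph attention with edge features (illustrative).

    edges maps (i, j) -> edge feature vector, meaning node i attends to j.
    Score: LeakyReLU(a . [W h_i || W h_j || e_ij]), softmax over j.
    """
    z = [matvec(W, h) for h in node_feats]
    out, attn = [], {}
    for i in range(len(node_feats)):
        nbrs = [(j, e) for (src, j), e in edges.items() if src == i]
        nbrs.append((i, [0.0] * edge_dim))  # self-loop, zero edge feature
        scores = [
            leaky_relu(sum(w * v for w, v in zip(a, z[i] + z[j] + e)))
            for j, e in nbrs
        ]
        alphas = softmax(scores)
        # New node vector: attention-weighted sum of transformed neighbours.
        out.append([
            sum(al * z[j][k] for al, (j, _) in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ])
        for al, (j, _) in zip(alphas, nbrs):
            attn[(i, j)] = al
    return out, attn


# Hypothetical sentence and hand-written (head, dependent, label) arcs.
tokens = ["The", "cat", "chased", "the", "mouse"]
arcs = [(1, 0, "det"), (2, 1, "nsubj"), (2, 4, "obj"), (4, 3, "det")]
labels = sorted({lab for _, _, lab in arcs})


def one_hot(label):
    return [1.0 if label == lab else 0.0 for lab in labels]


edges = {}
for head, dep, lab in arcs:
    edges[(head, dep)] = one_hot(lab)
    edges[(dep, head)] = one_hot(lab)  # assumption: arcs used in both directions

rng = random.Random(0)
in_dim, out_dim = 4, 3
# Random stand-ins for pre-trained word vectors and learned parameters.
node_feats = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in tokens]
W = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
a = [rng.uniform(-1, 1) for _ in range(2 * out_dim + len(labels))]

out, attn = gat_layer(node_feats, edges, W, a, len(labels))
for (i, j), weight in sorted(attn.items()):
    print(f"{tokens[i]:>6} -> {tokens[j]:<6} {weight:.3f}")
```

In a real pipeline the arcs would come from a dependency parser and the node features from a pre-trained language model; the one-hot label encoding is just one simple way to turn a dependency relation into an edge feature.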