Short Text Semantic Similarity Measurement Approach Based on Semantic Network

Estimating the semantic similarity between short texts plays an increasingly prominent role in many fields related to text mining and natural language processing applications, especially with the large increase in the volume of textual data that is produced daily. Traditional approaches for calcula...

Bibliographic Details
Main Authors:	Naamah Hussien Hameed, Adel M. Alimi, Ahmed T. Sadiq
Format:	Article
Language:	Arabic
Published:	College of Science for Women, University of Baghdad 2022-12-01
Series:	Baghdad Science Journal
Online Access:	https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/7255

author	Naamah Hussien Hameed, Adel M. Alimi, Ahmed T. Sadiq
collection	DOAJ
description	Estimating the semantic similarity between short texts plays an increasingly prominent role in many fields related to text mining and natural language processing applications, especially with the large increase in the volume of textual data that is produced daily. Traditional approaches for calculating the degree of similarity between two texts, based on the words they share, do not perform well with short texts because two similar texts may be written in different terms by employing synonyms. As a result, short texts should be semantically compared. In this paper, a semantic similarity measurement method between texts is presented which combines knowledge-based and corpus-based semantic information to build a semantic network that represents the relationship between the compared texts and extracts the degree of similarity between them. Representing a text as a semantic network is the best knowledge representation that comes close to the human mind's understanding of the texts, where the semantic network reflects the sentence's semantic, syntactical, and structural knowledge. The network representation is a visual representation of knowledge objects, their qualities, and their relationships. WordNet lexical database has been used as a knowledge-based source while the GloVe pre-trained word embedding vectors have been used as a corpus-based source. The proposed method was tested using three different datasets, DSCS, SICK, and MOHLER datasets. A good result has been obtained in terms of RMSE and MAE.
first_indexed	2024-04-11T06:19:47Z
format	Article
id	doaj.art-81ac293f56bc4397aa203c2bfbcab1c3
institution	Directory Open Access Journal
issn	2078-8665 2411-7986
language	Arabic
last_indexed	2024-04-11T06:19:47Z
publishDate	2022-12-01
publisher	College of Science for Women, University of Baghdad
record_format	Article
series	Baghdad Science Journal
spelling	doaj.art-81ac293f56bc4397aa203c2bfbcab1c3; 2022-12-22T04:40:48Z; ara; College of Science for Women, University of Baghdad; Baghdad Science Journal; ISSN 2078-8665, 2411-7986; 2022-12-01; vol. 19, no. 6(Suppl.); DOI 10.21123/bsj.2022.7255; Short Text Semantic Similarity Measurement Approach Based on Semantic Network; Naamah Hussien Hameed, Adel M. Alimi, Ahmed T. Sadiq (affiliation listed for each author: Computer Science Department, University of Technology, Baghdad, Iraq); https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/7255
title	Short Text Semantic Similarity Measurement Approach Based on Semantic Network
url	https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/7255

Short Text Semantic Similarity Measurement Approach Based on Semantic Network
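The description field in this record says the method combines a knowledge-based source (WordNet) with a corpus-based source (GloVe vectors) and is evaluated with RMSE and MAE. The sketch below is only a rough illustration of that general idea and is not the authors' semantic-network algorithm, which this record does not detail. Everything in it is an assumption made for illustration: the toy synonym table standing in for WordNet, the hand-made 3-dimensional vectors standing in for GloVe, the function names, the rule for combining the two sources, and the gold ratings.

```python
import math

# Hypothetical stand-in for WordNet: words in the same set count as synonyms.
TOY_SYNSETS = [{"car", "automobile"}, {"quick", "fast"}]

# Hypothetical stand-in for GloVe: tiny hand-made vectors, not real embeddings.
TOY_VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.0],
    "fast": [0.1, 0.9, 0.0],
    "quick": [0.2, 0.8, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_similarity(w1, w2):
    """Assumed combination rule: synonym match wins, else fall back to vectors."""
    if w1 == w2 or any(w1 in s and w2 in s for s in TOY_SYNSETS):
        return 1.0
    if w1 in TOY_VECTORS and w2 in TOY_VECTORS:
        return max(0.0, cosine(TOY_VECTORS[w1], TOY_VECTORS[w2]))
    return 0.0


def text_similarity(text1, text2):
    """Average, in both directions, of each word's best match in the other text."""
    words1, words2 = text1.lower().split(), text2.lower().split()
    if not words1 or not words2:
        return 0.0

    def directed(src, dst):
        return sum(max(word_similarity(a, b) for b in dst) for a in src) / len(src)

    return (directed(words1, words2) + directed(words2, words1)) / 2


def mae(predicted, gold):
    """Mean absolute error between predicted and gold scores."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(predicted, gold)) / len(gold)


def rmse(predicted, gold):
    """Root mean squared error between predicted and gold scores."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold))


if __name__ == "__main__":
    pairs = [("fast car", "quick automobile"), ("fast car", "banana")]
    gold = [0.9, 0.1]  # invented ratings, not taken from any dataset
    predicted = [text_similarity(a, b) for a, b in pairs]
    print("predicted:", predicted)
    print("MAE:", mae(predicted, gold), "RMSE:", rmse(predicted, gold))
```

With real resources, the synonym table would presumably be replaced by WordNet lookups and the toy vectors by pre-trained GloVe embeddings; lower RMSE and MAE against human ratings indicate closer agreement.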

Similar Items