Graph-based Representation for Sentence Similarity Measure : A Comparative Analysis

Textual data are a rich source of knowledge; hence, sentence comparison has become an important task in text mining. Most previous work on text comparison has been performed at the document level, and research suggests that comparing text at the sentence level is a non-trivial problem. One of the rea...

Bibliographic Details
Main Authors: Kamaruddin, Siti Sakira; Yusof, Yuhanis; Abu Bakar, Nur Azzah; Ahmed Tayie, Mohamed; Abdulsattar A.Jabbar Alkubaisi, Ghaith
Format: Article
Language: English
Published: Science Publishing Corporation Inc 2018
Subjects: QA75 Electronic computers. Computer science
Online Access: https://repo.uum.edu.my/id/eprint/24418/1/IJET%207%202.14%202018%2032%2035.pdf
author Kamaruddin, Siti Sakira
Yusof, Yuhanis
Abu Bakar, Nur Azzah
Ahmed Tayie, Mohamed
Abdulsattar A.Jabbar Alkubaisi, Ghaith
collection UUM
description Textual data are a rich source of knowledge; hence, sentence comparison has become an important task in text mining. Most previous work on text comparison has been performed at the document level, and research suggests that comparing text at the sentence level is a non-trivial problem. One reason is that two sentences can convey the same meaning using entirely different words. This paper presents the results of a comparative analysis of three representation schemes (term frequency-inverse document frequency, Latent Semantic Analysis, and graph-based representation) using three similarity measures (cosine, Dice coefficient, and Jaccard similarity) to compare sentences. Results reveal that the graph-based representation and the Jaccard similarity measure outperform the others in terms of precision, recall, and F-measure.
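The three similarity measures named in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes plain lowercase whitespace tokens as the sentence representation (the paper itself compares TF-IDF, Latent Semantic Analysis, and graph-based representations), and the function names and example sentences are hypothetical.

```python
from collections import Counter
from math import sqrt


def tokens(sentence):
    """Assumed representation: lowercase whitespace-separated words."""
    return sentence.lower().split()


def cosine(a, b):
    """Cosine similarity between the term-count vectors of two sentences."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) over the sets of distinct words."""
    sa, sb = set(tokens(a)), set(tokens(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| over the sets of distinct words."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Hypothetical example sentences sharing three distinct words: "the", "cat", "on".
s1 = "the cat sat on the mat"
s2 = "the cat lay on the rug"
print(cosine(s1, s2))   # 0.75
print(dice(s1, s2))     # 0.6
print(jaccard(s1, s2))  # 3/7, about 0.4286
```

Because all three measures here depend only on shared surface words, they score two paraphrases with no words in common as 0, which is the difficulty the description points to when it says two sentences can convey the same meaning with dissimilar words.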
first_indexed 2024-07-04T06:26:24Z
format Article
id uum-24418
institution Universiti Utara Malaysia
language English
last_indexed 2024-07-04T06:26:24Z
publishDate 2018
publisher Science Publishing Corporation Inc
record_format dspace
spelling uum-24418
2018-07-18T06:16:22Z
https://repo.uum.edu.my/id/eprint/24418/
Kamaruddin, Siti Sakira and Yusof, Yuhanis and Abu Bakar, Nur Azzah and Ahmed Tayie, Mohamed and Abdulsattar A.Jabbar Alkubaisi, Ghaith (2018) Graph-based Representation for Sentence Similarity Measure : A Comparative Analysis. International Journal of Engineering & Technology, 7 (2.14). pp. 32-35. ISSN 2227-524X
QA75 Electronic computers. Computer science
Science Publishing Corporation Inc 2018 Article PeerReviewed application/pdf en cc_by
http://doi.org/10.14419/ijet.v7i2.14.11149
doi:10.14419/ijet.v7i2.14.11149
title Graph-based Representation for Sentence Similarity Measure : A Comparative Analysis
topic QA75 Electronic computers. Computer science