Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles

Due to the exponential growth of information and web sources, automatic keyphrase extraction remains a challenging research problem. Keyphrases are helpful for several tasks in natural language processing (NLP) and information retrieval (IR) systems. Feature extractions for...

Bibliographic Details
Main Authors: Miah, Mohammad Badrul Alam; Suryanti, Awang; Md.Saiful, Azad
Format: Conference or Workshop Item
Language: English
Published: IEEE 2021
Subjects: QA76 Computer software
Online Access: http://umpir.ump.edu.my/id/eprint/33128/7/Region-Based%20Distance%20Analysis%20of%20Keyphrases1.pdf
author Miah, Mohammad Badrul Alam
Suryanti, Awang
Md.Saiful, Azad
collection UMP
description Due to the exponential growth of information and web sources, automatic keyphrase extraction remains a challenging research problem. Keyphrases are helpful for several tasks in natural language processing (NLP) and information retrieval (IR) systems. Feature extraction for keyphrases plays a vital role in extracting high-quality keyphrases and in summarising documents well. This paper proposes a new unsupervised technique, region-based distance analysis of keyphrases (RDAK), for extracting keyphrase features from articles. The proposed method comprises six phases: data acquisition and preprocessing, data processing, distance calculation, average distance, curve plotting, and curve fitting. First, the collected datasets are passed to the preprocessing step, which applies text preprocessing techniques. The preprocessed data then goes through the data processing phase; after distance calculation, it is passed to the region-based average calculation, followed by curve plotting analysis and finally curve fitting. The proposed system has been tested and its performance evaluated on benchmark datasets. The proposed system will significantly improve the performance of existing keyphrase extraction techniques.
first_indexed 2024-03-06T12:54:39Z
format Conference or Workshop Item
id UMPir33128
institution Universiti Malaysia Pahang
language English
last_indexed 2024-03-06T12:54:39Z
publishDate 2021
publisher IEEE
record_format dspace
spelling UMPir33128 2024-01-05T07:38:18Z http://umpir.ump.edu.my/id/eprint/33128/
IEEE 2021 Conference or Workshop Item PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/33128/7/Region-Based%20Distance%20Analysis%20of%20Keyphrases1.pdf Miah, Mohammad Badrul Alam and Suryanti, Awang and Md.Saiful, Azad (2021) Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles. In: International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM 2021), 24-26 August 2021, Pekan, Pahang, Malaysia. pp. 124-129. ISBN 978-1-6654-1407-4 https://doi.org/10.1109/ICSECS52883.2021.00030
title Region-Based Distance Analysis of Keyphrases: A New Unsupervised Method for Extracting Keyphrases Feature from Articles
topic QA76 Computer software
url http://umpir.ump.edu.my/id/eprint/33128/7/Region-Based%20Distance%20Analysis%20of%20Keyphrases1.pdf