Applying particle swarm optimization-based dynamic adaptive hyperlink evaluation to focused crawler for meteorological disasters

Abstract Traditional semantic-based focused crawlers calculate the topical priority of hyperlinks by linearly integrating topical similarity evaluation metrics and empirical weights. However, the manually pre-determined weights may introduce bias in evaluating hyperlinks, resulting in topic deviation...


Bibliographic Details
Main Authors: Jingfa Liu, Zhihe Yang, Xueming Yan, Duanbing Chen
Format: Article
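Language: English
Published: Springer 2023-07-01
Series: Complex & Intelligent Systems
Subjects: Focused crawler; Hyperlink priority evaluation; Particle swarm optimization; Meteorological disasters; Ontology
Online Access: https://doi.org/10.1007/s40747-023-01121-4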
author Jingfa Liu
Zhihe Yang
Xueming Yan
Duanbing Chen
collection DOAJ
description Abstract Traditional semantic-based focused crawlers calculate the topical priority of hyperlinks by linearly integrating topical similarity evaluation metrics and empirical weights. However, the manually pre-determined weights may introduce bias in evaluating hyperlinks, resulting in topic deviation during crawling. To address this problem, we propose a dynamic adaptive procedure based on particle swarm optimization that updates the weights at every crawling step, and we put forward a new focused crawler called FCPSO. In FCPSO, we use a domain ontology for topic representation and a comprehensive priority evaluation method to evaluate the topical priority of hyperlinks. Furthermore, we construct a multi-objective optimization model for hyperlink selection, in which a strategy of non-dominated sorting with the nearest farthest candidate solution is proposed to select Pareto-optimal hyperlinks and guide the crawling direction. Extensive experiments demonstrate that FCPSO outperforms other strategies: it obtains more topic-relevant webpages in less time.
first_indexed 2024-03-07T14:25:17Z
format Article
id doaj.art-831d7b9209234feb86d56cb3a488e8e0
institution Directory of Open Access Journals
issn 2199-4536
2198-6053
language English
last_indexed 2024-03-07T14:25:17Z
publishDate 2023-07-01
publisher Springer
record_format Article
series Complex & Intelligent Systems
spelling doaj.art-831d7b9209234feb86d56cb3a488e8e0
2024-03-06T08:06:44Z
eng
Springer
Complex & Intelligent Systems
2199-4536
2198-6053
2023-07-01
vol. 10, no. 1, pp. 233-255
10.1007/s40747-023-01121-4
Applying particle swarm optimization-based dynamic adaptive hyperlink evaluation to focused crawler for meteorological disasters
Jingfa Liu, School of Information Science and Technology and Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies
Zhihe Yang, School of Information Science and Technology and Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies
Xueming Yan, School of Information Science and Technology and Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies
Duanbing Chen, Big Data Research Center, University of Electronic Science and Technology of China
Abstract Traditional semantic-based focused crawlers calculate the topical priority of hyperlinks by linearly integrating topical similarity evaluation metrics and empirical weights. However, the manually pre-determined weights may introduce bias in evaluating hyperlinks, resulting in topic deviation during crawling. To address this problem, we propose a dynamic adaptive procedure based on particle swarm optimization that updates the weights at every crawling step, and we put forward a new focused crawler called FCPSO. In FCPSO, we use a domain ontology for topic representation and a comprehensive priority evaluation method to evaluate the topical priority of hyperlinks. Furthermore, we construct a multi-objective optimization model for hyperlink selection, in which a strategy of non-dominated sorting with the nearest farthest candidate solution is proposed to select Pareto-optimal hyperlinks and guide the crawling direction. Extensive experiments demonstrate that FCPSO outperforms other strategies: it obtains more topic-relevant webpages in less time.
https://doi.org/10.1007/s40747-023-01121-4
Focused crawler
Hyperlink priority evaluation
Particle swarm optimization
Meteorological disasters
Ontology
title Applying particle swarm optimization-based dynamic adaptive hyperlink evaluation to focused crawler for meteorological disasters
topic Focused crawler
Hyperlink priority evaluation
Particle swarm optimization
Meteorological disasters
Ontology
url https://doi.org/10.1007/s40747-023-01121-4