Applying particle swarm optimization-based dynamic adaptive hyperlink evaluation to focused crawler for meteorological disasters
Abstract Traditional semantic-based focused crawlers calculate the topical priority of a hyperlink by linearly combining topical similarity evaluation metrics with empirical weights. However, manually predetermined weights may introduce bias when evaluating hyperlinks, resulting in topic deviation...
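The linear weighting described in the abstract, and the idea of adjusting the weights with particle swarm optimization, can be illustrated with a minimal sketch. This is not the FCPSO implementation: the record does not give the paper's metrics, fitness function, or parameter values, so the function names (`link_priority`, `normalize`, `pso_step`), the coefficients (`inertia`, `c1`, `c2`), and the normalization of weights to sum to 1 are all assumptions made for illustration, using the canonical PSO update rule.

```python
import random


def link_priority(metrics, weights):
    """Topical priority as a linear combination of similarity metrics.

    `metrics` and `weights` are hypothetical: one similarity score and one
    weight per evaluation metric.
    """
    if len(metrics) != len(weights):
        raise ValueError("metrics and weights must have the same length")
    return sum(m * w for m, w in zip(metrics, weights))


def normalize(weights):
    """Clamp weights to be non-negative and rescale them to sum to 1 (assumption)."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        # Fall back to uniform weights if everything was clipped to zero.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]


def pso_step(position, velocity, personal_best, global_best, rng,
             inertia=0.7, c1=1.5, c2=1.5):
    """One canonical PSO update of a particle, here a candidate weight vector.

    The coefficient values are common textbook defaults, not the paper's.
    """
    new_velocity = []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        new_velocity.append(inertia * v + c1 * r1 * (p - x) + c2 * r2 * (g - x))
    new_position = normalize([x + v for x, v in zip(position, new_velocity)])
    return new_position, new_velocity


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.5, 0.3, 0.2]          # hand-picked ("empirical") starting weights
    scores = [0.8, 0.4, 0.6]           # made-up similarity scores for one hyperlink
    print(link_priority(scores, weights))
    weights, velocity = pso_step(weights, [0.0, 0.0, 0.0],
                                 [0.4, 0.4, 0.2], [0.2, 0.5, 0.3], rng)
    print(weights)
```

In a crawler, a step like `pso_step` would presumably be repeated for a swarm of weight vectors at each crawling step, with the best-scoring vector used to rank the candidate hyperlinks; how the paper scores a weight vector is not stated in this record.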
Main Authors: | Jingfa Liu, Zhihe Yang, Xueming Yan, Duanbing Chen |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2023-07-01 |
Series: | Complex & Intelligent Systems |
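Subjects: | Focused crawler; Hyperlink priority evaluation; Particle swarm optimization; Meteorological disasters; Ontology |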
Online Access: | https://doi.org/10.1007/s40747-023-01121-4 |
_version_ | 1827325309824794624 |
---|---|
author | Jingfa Liu; Zhihe Yang; Xueming Yan; Duanbing Chen |
author_sort | Jingfa Liu |
collection | DOAJ |
description | Abstract Traditional semantic-based focused crawlers calculate the topical priority of a hyperlink by linearly combining topical similarity evaluation metrics with empirical weights. However, manually predetermined weights may introduce bias when evaluating hyperlinks, resulting in topic deviation during crawling. To address this problem, we propose a dynamic adaptive procedure based on particle swarm optimization that updates the weights at every crawling step, and we put forward a new focused crawler called FCPSO. In FCPSO, we use a domain ontology for topic representation and a comprehensive priority evaluation method to evaluate the topical priority of each hyperlink. Furthermore, we construct a multi-objective optimization model for hyperlink selection, in which a non-dominant sorting strategy with the nearest farthest candidate solution is proposed to select Pareto-optimal hyperlinks and guide the crawling direction. Extensive experiments show that FCPSO outperforms other strategies: it obtains more topic-relevant webpages in less time. |
first_indexed | 2024-03-07T14:25:17Z |
format | Article |
id | doaj.art-831d7b9209234feb86d56cb3a488e8e0 |
institution | Directory of Open Access Journals |
issn | 2199-4536 2198-6053 |
language | English |
last_indexed | 2024-03-07T14:25:17Z |
publishDate | 2023-07-01 |
publisher | Springer |
record_format | Article |
series | Complex & Intelligent Systems |
spelling | doaj.art-831d7b9209234feb86d56cb3a488e8e0; 2024-03-06T08:06:44Z; eng; Springer; Complex & Intelligent Systems; 2199-4536; 2198-6053; 2023-07-01; vol. 10, no. 1, pp. 233-255; 10.1007/s40747-023-01121-4; Applying particle swarm optimization-based dynamic adaptive hyperlink evaluation to focused crawler for meteorological disasters; Jingfa Liu (0), Zhihe Yang (1), Xueming Yan (2), Duanbing Chen (3); (0), (1), (2) School of Information Science and Technology and Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies; (3) Big Data Research Center, University of Electronic Science and Technology of China; https://doi.org/10.1007/s40747-023-01121-4; Focused crawler; Hyperlink priority evaluation; Particle swarm optimization; Meteorological disasters; Ontology |
title | Applying particle swarm optimization-based dynamic adaptive hyperlink evaluation to focused crawler for meteorological disasters |
topic | Focused crawler; Hyperlink priority evaluation; Particle swarm optimization; Meteorological disasters; Ontology |
url | https://doi.org/10.1007/s40747-023-01121-4 |