Building a Technology Recommender System Using Web Crawling and Natural Language Processing Technology

Finding, retrieving, and processing information on technology from the Internet can be a tedious task. This article investigates if technological concepts such as web crawling and natural language processing are suitable means for knowledge discovery from unstructured information and the development...

Bibliographic Details
Main Authors: Nathalie Campos Macias, Wilhelm Düggelin, Yesim Ruf, Thomas Hanne
Format: Article
Language: English
Published: MDPI AG 2022-08-01
Series: Algorithms
Subjects: recommender systems; web crawling; natural language processing
Online Access: https://www.mdpi.com/1999-4893/15/8/272
collection DOAJ
description Finding, retrieving, and processing information on technology from the Internet can be a tedious task. This article investigates whether technological concepts such as web crawling and natural language processing are suitable means for knowledge discovery from unstructured information and for the development of a technology recommender system, by developing a prototype of such a system. It also analyzes how well the resulting prototype performs with regard to effectiveness and efficiency. The research strategy, based on design science research, consists of four stages: (1) awareness generation; (2) suggestion of a solution considering the information retrieval process; (3) development of an artefact in the form of a Python computer program; and (4) evaluation of the prototype within the scope of a comparative experiment. The evaluation shows that the prototype is highly efficient in retrieving basic and rather random extractive text summaries from websites that include the desired search terms. However, the effectiveness, measured by the quality of results, is unsatisfactory due to the aforementioned random arrangement of extracted sentences within the resulting summaries. It is found that natural language processing and web crawling are indeed suitable technologies for such a program, whilst the use of additional technologies and concepts would add significant value for a potential user. Several areas for incremental improvement of the prototype are identified.
first_indexed 2024-03-09T10:03:43Z
id doaj.art-b86e73b6827e49e09020d10f976b33cf
institution Directory Open Access Journal
issn 1999-4893
last_indexed 2024-03-09T10:03:43Z
spelling doaj.art-b86e73b6827e49e09020d10f976b33cf
Timestamp: 2023-12-01T23:17:31Z
Language code: eng
Journal: Algorithms (MDPI AG), ISSN 1999-4893
Issue: vol. 15, no. 8, article 272 (2022-08-01)
DOI: 10.3390/a15080272
Author affiliation (all four authors): Institute for Information Systems, University of Applied Sciences and Arts Northwestern Switzerland, 4600 Olten, Switzerland
topic recommender systems
web crawling
natural language processing
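
The description above mentions extractive text summaries built from pages that contain the desired search terms, and notes that the random arrangement of extracted sentences hurt result quality. The following is a minimal illustrative sketch of that general idea, not the authors' prototype: it scores sentences by word frequency, boosts sentences that mention a search term, and returns the selected sentences in their original order. The function names, the tiny stop-word list, the substring matching, and the scoring formula are all assumptions made for illustration; fetching pages from the web is left out.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "from", "was", "is", "each", "for", "on"}


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace (a naive rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lower-cased alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, search_terms, max_sentences=2):
    """Return up to max_sentences sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))
    terms = [t.lower() for t in search_terms]

    scored = []
    for idx, sentence in enumerate(sentences):
        tokens = content_words(sentence)
        if not tokens:
            continue
        # Average word frequency, so long sentences are not favoured automatically.
        score = sum(freq[w] for w in tokens) / len(tokens)
        # Crude substring match: boost sentences that mention any search term.
        if any(term in sentence.lower() for term in terms):
            score += 1.0
        scored.append((score, idx))

    top = sorted(scored, reverse=True)[:max_sentences]
    # Re-sort by position so the summary follows the source order.
    return [sentences[i] for i in sorted(i for _, i in top)]


if __name__ == "__main__":
    page_text = (
        "Web crawling collects pages from the Internet. "
        "The weather was nice yesterday. "
        "A crawler follows links and stores each page. "
        "Cats sleep a lot."
    )
    for line in summarize(page_text, ["crawl"]):
        print(line)
```

Restoring source order after selection is one simple way to avoid a shuffled summary; whether it would have improved the prototype's evaluated quality is not something this record states.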