An Experimental Study of Convolutional Neural Networks for Functional and Subject Classification of Web Pages

Bibliographic Details
Main Authors: Codruţ-Georgian Artene, Dumitru-Daniel Vecliuc, Marius Nicolae Tibeică, Florin Leon
Format: Article
Language: English
Published: World Scientific Publishing 2022-11-01
Series: Vietnam Journal of Computer Science
Online Access: https://www.worldscientific.com/doi/10.1142/S2196888822500245
Description
Summary: Information filtering and information retrieval applications rely on web page classification methods. Web pages usually serve different functions or cover different topics or subjects. This diversity of content increases the need for automatic web page classification while also making it a challenging task. The main component of a web page's content is most often its text, and text classification has been studied intensively in recent years, with researchers reporting state-of-the-art results for various methods; applying these methods to the text extracted from web pages could therefore lead to important results. In this work, we revisit our experimental study on convolutional neural networks for multi-label, multi-language web page classification with a new approach that divides the classification problem into functional classification and subject classification of web pages. The experimental evaluation suggests that separating the functional and subject classification of web pages improves the overall results.
ISSN: 2196-8888, 2196-8896