Autonomous schema markups based on intelligent computing for search engine optimization

With advances in artificial intelligence and semantic technology, search engines are integrating semantics to address complex search queries to improve the results. This requires identification of well-known concepts or entities and their relationship from web page contents. But the increase in comp...

Bibliographic Details
Main Authors:	Burhan Ud Din Abbasi, Iram Fatima, Hamid Mukhtar, Sharifullah Khan, Abdulaziz Alhumam, Hafiz Farooq Ahmad
Format:	Article
Language:	English
Published:	PeerJ Inc. 2022-12-01
Series:	PeerJ Computer Science
Subjects:	Schema.org; Search engine optimization; Semantic search; Unstructured data; Content discovery
Online Access:	https://peerj.com/articles/cs-1163.pdf

collection	DOAJ
description	With advances in artificial intelligence and semantic technology, search engines are integrating semantics to handle complex search queries and improve results. This requires identifying well-known concepts or entities and their relationships from web page contents. Yet the increase in complex unstructured data on web pages has made concept identification overly complex. Existing research focuses on entity recognition from the perspective of linguistic structures such as complete sentences and paragraphs, whereas a large part of the data on web pages exists as unstructured text fragments enclosed in HTML tags. Ontologies provide schemas to structure the data on the web. However, including them in web pages requires additional resources and expertise from organizations or webmasters, which is a major hindrance to their large-scale adoption. We propose an approach for autonomous identification of entities from short text present in web pages to populate semantic models based on a specific ontology model. The proposed approach has been applied to a public dataset containing academic web pages. We employ a long short-term memory (LSTM) deep learning network and the random forest machine learning algorithm to predict entities. The proposed methodology gives an overall accuracy of 0.94 on the test dataset, indicating a potential for automated prediction even with a limited number of training samples for various entities, thus significantly reducing the manual workload required in practical applications.
first_indexed	2024-04-11T14:25:46Z
format	Article
id	doaj.art-a4c4444fffe543d49fb7ed8af4c7aca6
institution	Directory of Open Access Journals
issn	2376-5992
language	English
last_indexed	2024-04-11T14:25:46Z
publishDate	2022-12-01
publisher	PeerJ Inc.
record_format	Article
series	PeerJ Computer Science
spelling	doaj.art-a4c4444fffe543d49fb7ed8af4c7aca6; 2022-12-22T04:18:53Z; eng; PeerJ Inc.; PeerJ Computer Science; ISSN 2376-5992; 2022-12-01; volume 8; e1163; DOI 10.7717/peerj-cs.1163; Autonomous schema markups based on intelligent computing for search engine optimization
	Authors: Burhan Ud Din Abbasi; Iram Fatima; Hamid Mukhtar; Sharifullah Khan; Abdulaziz Alhumam; Hafiz Farooq Ahmad
	Affiliations (in listed order): Department of Computer Science, Bahria University, Islamabad, Pakistan; Schema App-Hunch Manifest Inc, Guelph, Canada; Department of Computer Science, College of Engineering and Physical Sciences (EPS), University of Birmingham Dubai, Dubai, United Arab Emirates; PAF-Institute of Applied Sciences and Technology, Haripur, Pakistan; Computer Science Department, College of Computer Sciences and Information Technology (CCSIT), King Faisal University, Al-Ahsa, Saudi Arabia; Computer Science Department, College of Computer Sciences and Information Technology (CCSIT), King Faisal University, Al-Ahsa, Saudi Arabia
	https://peerj.com/articles/cs-1163.pdf; Schema.org; Search engine optimization; Semantic search; Unstructured data; Content discovery
title	Autonomous schema markups based on intelligent computing for search engine optimization
topic	Schema.org; Search engine optimization; Semantic search; Unstructured data; Content discovery
url	https://peerj.com/articles/cs-1163.pdf

Similar Items