Document enrichment using semantic tags for effective XML retrieval


Bibliographic Details
Main Authors: Abubakar, Roko; C. Doraisamy, Shyamala; Azman, Azreen; Jantan, Azrul Hazri
Format: Article
Language: English
Published: Advanced Institute of Convergence Information Technology (AICIT) 2013
Online Access: http://psasir.upm.edu.my/id/eprint/30605/1/Document%20enrichment%20using%20semantic%20tags%20for%20effective%20XML%20retrieval.pdf
collection UPM
description Marking up document contents with user-defined, self-descriptive terms has made XML one of the most widely used technologies for information representation and exchange over the Internet. As a result, many documents are now represented and stored as XML documents on the web, and there is a need for precise, efficient and user-friendly search techniques. Existing systems that support Content Only (CO) queries fall into three categories: Lowest Common Ancestor (LCA)-based systems, query structuring systems and document structure-based systems. The answers returned by the first group are either irrelevant to the user’s search intention or not meaningful or informative enough, because of the restriction on the choice of the root node. The other group mostly requires a data schema for its query conversion, which is not always available or may be complex and fast evolving. Most existing systems place their emphasis on the query side. In this paper, we focus on the document side instead. Our approach exploits document structure: we enriched the text of Wikipedia XML documents with the annotated semantic tags present in the documents. The effect of enriching elements’ text content is investigated through three retrieval experiments in which only the text content of the document collection differs. The results revealed that enriching elements’ text content with the semantic tags could improve the effectiveness of CO queries.
id upm.eprints-30605
institution Universiti Putra Malaysia
spelling upm.eprints-30605 2015-09-09T06:07:08Z http://psasir.upm.edu.my/id/eprint/30605/ Article PeerReviewed application/pdf en Abubakar, Roko and C. Doraisamy, Shyamala and Azman, Azreen and Jantan, Azrul Hazri (2013) Document enrichment using semantic tags for effective XML retrieval. International Journal of Advancements in Computing Technology, 5 (13). pp. 138-146. ISSN 2005-8039