Automata Approach to XML Data Indexing

The internal structure of XML documents can be viewed as a tree. Trees are among the fundamental and well-studied data structures in computer science. They express a hierarchical structure and are widely used in many applications. This paper focuses on the problem of processing tree data structures;...

Full description

Bibliographic Details
Main Authors: Eliška Šestáková, Jan Janoušek
Format: Article
Language:English
Published: MDPI AG 2018-01-01
Series:Information
Subjects:
Online Access:http://www.mdpi.com/2078-2489/9/1/12
_version_ 1818530089864790016
author Eliška Šestáková
Jan Janoušek
author_facet Eliška Šestáková
Jan Janoušek
author_sort Eliška Šestáková
collection DOAJ
description The internal structure of XML documents can be viewed as a tree. Trees are among the fundamental and well-studied data structures in computer science. They express a hierarchical structure and are widely used in many applications. This paper focuses on the problem of processing tree data structures; particularly, it studies the XML index problem. Although there exist many state-of-the-art methods, the XML index problem still belongs to the active research areas. However, existing methods usually lack clear references to a systematic approach to the standard theory of formal languages and automata. Therefore, we present some new methods solving the XML index problem using the automata theory. These methods are simple and allow one to efficiently process a small subset of XPath. Thus, having an XML data structure, our methods can be used efficiently as auxiliary data structures that enable answering a particular set of queries, e.g., XPath queries using any combination of the child and descendant-or-self axes. Given an XML tree model with n nodes, the searching phase uses the index, reads an input query of size m, finds the answer in time O ( m ) and does not depend on the size of the original XML document.
first_indexed 2024-12-11T17:15:20Z
format Article
id doaj.art-613d0a2fe963431e8f198a2a35758c86
institution Directory Open Access Journal
issn 2078-2489
language English
last_indexed 2024-12-11T17:15:20Z
publishDate 2018-01-01
publisher MDPI AG
record_format Article
series Information
spelling doaj.art-613d0a2fe963431e8f198a2a35758c862022-12-22T00:57:22ZengMDPI AGInformation2078-24892018-01-01911210.3390/info9010012info9010012Automata Approach to XML Data IndexingEliška Šestáková0Jan Janoušek1Faculty of Information Technology, Czech Technical University in Prague, Thákurova 9, 160 00 Praha 6, Czech RepublicFaculty of Information Technology, Czech Technical University in Prague, Thákurova 9, 160 00 Praha 6, Czech RepublicThe internal structure of XML documents can be viewed as a tree. Trees are among the fundamental and well-studied data structures in computer science. They express a hierarchical structure and are widely used in many applications. This paper focuses on the problem of processing tree data structures; particularly, it studies the XML index problem. Although there exist many state-of-the-art methods, the XML index problem still belongs to the active research areas. However, existing methods usually lack clear references to a systematic approach to the standard theory of formal languages and automata. Therefore, we present some new methods solving the XML index problem using the automata theory. These methods are simple and allow one to efficiently process a small subset of XPath. Thus, having an XML data structure, our methods can be used efficiently as auxiliary data structures that enable answering a particular set of queries, e.g., XPath queries using any combination of the child and descendant-or-self axes. Given an XML tree model with n nodes, the searching phase uses the index, reads an input query of size m, finds the answer in time O ( m ) and does not depend on the size of the original XML document.http://www.mdpi.com/2078-2489/9/1/12XMLXPathindexindexingtreeautomatonfinite state automatonfinite state machine
spellingShingle Eliška Šestáková
Jan Janoušek
Automata Approach to XML Data Indexing
Information
XML
XPath
index
indexing
tree
automaton
finite state automaton
finite state machine
title Automata Approach to XML Data Indexing
title_full Automata Approach to XML Data Indexing
title_fullStr Automata Approach to XML Data Indexing
title_full_unstemmed Automata Approach to XML Data Indexing
title_short Automata Approach to XML Data Indexing
title_sort automata approach to xml data indexing
topic XML
XPath
index
indexing
tree
automaton
finite state automaton
finite state machine
url http://www.mdpi.com/2078-2489/9/1/12
work_keys_str_mv AT eliskasestakova automataapproachtoxmldataindexing
AT janjanousek automataapproachtoxmldataindexing