The complexity and expressive power of limit Datalog

Motivated by applications in declarative data analysis, in this article, we study DatalogZ—an extension of Datalog with stratified negation and arithmetic functions over integers. This language is known to be undecidable, so we present the fragment of limit DatalogZ programs, which is powerful enoug...

Full description

Bibliographic Details
Main Authors: Kaminski, M, Kostylev, E, Cuenca Grau, B, Motik, B, Horrocks, I
Format: Journal article
Language:English
Published: Association for Computing Machinery 2021
_version_ 1797096126266998784
author Kaminski, M
Kostylev, E
Cuenca Grau, B
Motik, B
Horrocks, I
author_facet Kaminski, M
Kostylev, E
Cuenca Grau, B
Motik, B
Horrocks, I
author_sort Kaminski, M
collection OXFORD
description Motivated by applications in declarative data analysis, in this article, we study DatalogZ—an extension of Datalog with stratified negation and arithmetic functions over integers. This language is known to be undecidable, so we present the fragment of limit DatalogZ programs, which is powerful enough to naturally capture many important data analysis tasks. In limit DatalogZ, all intensional predicates with a numeric argument are limit predicates that keep maximal or minimal bounds on numeric values. We show that reasoning in limit DatalogZ is decidable if a linearity condition restricting the use of multiplication is satisfied. In particular, limit-linear DatalogZ is complete for Δ2EXP and captures Δ2P over ordered datasets in the sense of descriptive complexity. We also provide a comprehensive study of several fragments of limit-linear DatalogZ. We show that semi-positive limit-linear programs (i.e., programs where negation is allowed only in front of extensional atoms) capture coNP over ordered datasets; furthermore, reasoning becomes coNEXP-complete in combined and coNP-complete in data complexity, where the lower bounds hold already for negation-free programs. In order to satisfy the requirements of data-intensive applications, we also propose an additional stability requirement, which causes the complexity of reasoning to drop to EXP in combined and to P in data complexity, thus obtaining the same bounds as for usual Datalog. Finally, we compare our formalisms with the languages underpinning existing Datalog-based approaches for data analysis and show that core fragments of these languages can be encoded as limit programs; this allows us to transfer decidability and complexity upper bounds from limit programs to other formalisms. Therefore, our article provides a unified logical framework for declarative data analysis which can be used as a basis for understanding the impact on expressive power and computational complexity of the key constructs available in existing languages.
first_indexed 2024-03-07T04:37:36Z
format Journal article
id oxford-uuid:d07e93d8-c502-4501-80eb-befd6e6d8f61
institution University of Oxford
language English
last_indexed 2024-03-07T04:37:36Z
publishDate 2021
publisher Association for Computing Machinery
record_format dspace
spelling oxford-uuid:d07e93d8-c502-4501-80eb-befd6e6d8f612022-03-27T07:50:21ZThe complexity and expressive power of limit DatalogJournal articlehttp://purl.org/coar/resource_type/c_dcae04bcuuid:d07e93d8-c502-4501-80eb-befd6e6d8f61EnglishSymplectic ElementsAssociation for Computing Machinery2021Kaminski, MKostylev, ECuenca Grau, BMotik, BHorrocks, IMotivated by applications in declarative data analysis, in this article, we study DatalogZ—an extension of Datalog with stratified negation and arithmetic functions over integers. This language is known to be undecidable, so we present the fragment of limit DatalogZ programs, which is powerful enough to naturally capture many important data analysis tasks. In limit DatalogZ, all intensional predicates with a numeric argument are limit predicates that keep maximal or minimal bounds on numeric values. We show that reasoning in limit DatalogZ is decidable if a linearity condition restricting the use of multiplication is satisfied. In particular, limit-linear DatalogZ is complete for Δ2EXP and captures Δ2P over ordered datasets in the sense of descriptive complexity. We also provide a comprehensive study of several fragments of limit-linear DatalogZ. We show that semi-positive limit-linear programs (i.e., programs where negation is allowed only in front of extensional atoms) capture coNP over ordered datasets; furthermore, reasoning becomes coNEXP-complete in combined and coNP-complete in data complexity, where the lower bounds hold already for negation-free programs. In order to satisfy the requirements of data-intensive applications, we also propose an additional stability requirement, which causes the complexity of reasoning to drop to EXP in combined and to P in data complexity, thus obtaining the same bounds as for usual Datalog. Finally, we compare our formalisms with the languages underpinning existing Datalog-based approaches for data analysis and show that core fragments of these languages can be encoded as limit programs; this allows us to transfer decidability and complexity upper bounds from limit programs to other formalisms. Therefore, our article provides a unified logical framework for declarative data analysis which can be used as a basis for understanding the impact on expressive power and computational complexity of the key constructs available in existing languages.
spellingShingle Kaminski, M
Kostylev, E
Cuenca Grau, B
Motik, B
Horrocks, I
The complexity and expressive power of limit Datalog
title The complexity and expressive power of limit Datalog
title_full The complexity and expressive power of limit Datalog
title_fullStr The complexity and expressive power of limit Datalog
title_full_unstemmed The complexity and expressive power of limit Datalog
title_short The complexity and expressive power of limit Datalog
title_sort complexity and expressive power of limit datalog
work_keys_str_mv AT kaminskim thecomplexityandexpressivepoweroflimitdatalog
AT kostyleve thecomplexityandexpressivepoweroflimitdatalog
AT cuencagraub thecomplexityandexpressivepoweroflimitdatalog
AT motikb thecomplexityandexpressivepoweroflimitdatalog
AT horrocksi thecomplexityandexpressivepoweroflimitdatalog
AT kaminskim complexityandexpressivepoweroflimitdatalog
AT kostyleve complexityandexpressivepoweroflimitdatalog
AT cuencagraub complexityandexpressivepoweroflimitdatalog
AT motikb complexityandexpressivepoweroflimitdatalog
AT horrocksi complexityandexpressivepoweroflimitdatalog