Integrity Constraints Revisited: From Exact to Approximate Implication

Integrity constraints such as functional dependencies (FD) and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whe...

Full description

Bibliographic Details
Main Authors:	Batya Kenig, Dan Suciu
Format:	Article
Language:	English
Published:	Logical Methods in Computer Science e.V. 2022-01-01
Series:	Logical Methods in Computer Science
Subjects:	computer science - databases
Online Access:	https://lmcs.episciences.org/6925/pdf

_version_	1797268468233404416
author	Batya Kenig Dan Suciu
author_facet	Batya Kenig Dan Suciu
author_sort	Batya Kenig
collection	DOAJ
description	Integrity constraints such as functional dependencies (FD) and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Then, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Finally, we show how some of the results in the paper can be derived using the I-measure theory, which relates between information theoretic measures and set theory. Our results recover, and sometimes extend, previously known results about the implication problem: the implication of MVDs and FDs can be checked by considering only 2-tuple relations.
first_indexed	2024-04-25T01:32:57Z
format	Article
id	doaj.art-04e2ad0d45d34300a58907bffcd4e253
institution	Directory Open Access Journal
issn	1860-5974
language	English
last_indexed	2024-04-25T01:32:57Z
publishDate	2022-01-01
publisher	Logical Methods in Computer Science e.V.
record_format	Article
series	Logical Methods in Computer Science
spelling	doaj.art-04e2ad0d45d34300a58907bffcd4e2532024-03-08T10:36:53ZengLogical Methods in Computer Science e.V.Logical Methods in Computer Science1860-59742022-01-01Volume 18, Issue 110.46298/lmcs-18(1:5)20226925Integrity Constraints Revisited: From Exact to Approximate ImplicationBatya KenigDan SuciuIntegrity constraints such as functional dependencies (FD) and multi-valued dependencies (MVD) are fundamental in database schema design. Likewise, probabilistic conditional independences (CI) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent), and has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Then, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Finally, we show how some of the results in the paper can be derived using the I-measure theory, which relates between information theoretic measures and set theory. Our results recover, and sometimes extend, previously known results about the implication problem: the implication of MVDs and FDs can be checked by considering only 2-tuple relations.https://lmcs.episciences.org/6925/pdfcomputer science - databases
spellingShingle	Batya Kenig Dan Suciu Integrity Constraints Revisited: From Exact to Approximate Implication Logical Methods in Computer Science computer science - databases
title	Integrity Constraints Revisited: From Exact to Approximate Implication
title_full	Integrity Constraints Revisited: From Exact to Approximate Implication
title_fullStr	Integrity Constraints Revisited: From Exact to Approximate Implication
title_full_unstemmed	Integrity Constraints Revisited: From Exact to Approximate Implication
title_short	Integrity Constraints Revisited: From Exact to Approximate Implication
title_sort	integrity constraints revisited from exact to approximate implication
topic	computer science - databases
url	https://lmcs.episciences.org/6925/pdf
work_keys_str_mv	AT batyakenig integrityconstraintsrevisitedfromexacttoapproximateimplication AT dansuciu integrityconstraintsrevisitedfromexacttoapproximateimplication

Integrity Constraints Revisited: From Exact to Approximate Implication

Similar Items