Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package

It is well known in the literature that the problem of learning the structure of Bayesian networks is very hard to tackle: Its computational complexity is super-exponential in the number of nodes in the worst case and polynomial in most real-world scenarios. Efficient implementations of score-based s...

Bibliographic Details
Main Author:	Scutari, M
Format:	Journal article
Published:	Foundation for Open Access Statistics 2017

_version_	1826269777939660800
author	Scutari, M
author_facet	Scutari, M
author_sort	Scutari, M
collection	OXFORD
description	It is well known in the literature that the problem of learning the structure of Bayesian networks is very hard to tackle: Its computational complexity is super-exponential in the number of nodes in the worst case and polynomial in most real-world scenarios. Efficient implementations of score-based structure learning benefit from past and current research in optimization theory, which can be adapted to the task by using the network score as the objective function to maximize. This is not true for approaches based on conditional independence tests, called constraint-based learning algorithms. The only optimization in widespread use, backtracking, leverages the symmetries implied by the definitions of neighborhood and Markov blanket. In this paper we illustrate how backtracking is implemented in recent versions of the bnlearn R package, and how it degrades the stability of Bayesian network structure learning for little gain in terms of speed. As an alternative, we describe a software architecture and framework that can be used to parallelize constraint-based structure learning algorithms (also implemented in bnlearn) and we demonstrate its performance using four reference networks and two real-world data sets from genetics and systems biology. We show that on modern multi-core or multiprocessor hardware parallel implementations are preferable over backtracking, which was developed when single-processor machines were the norm.
first_indexed	2024-03-06T21:30:27Z
format	Journal article
id	oxford-uuid:44856555-4884-4ccd-8ebf-6b5aba5b11b0
institution	University of Oxford
last_indexed	2024-03-06T21:30:27Z
publishDate	2017
publisher	Foundation for Open Access Statistics
record_format	dspace
spelling	oxford-uuid:44856555-4884-4ccd-8ebf-6b5aba5b11b02022-03-26T15:02:03ZBayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R PackageJournal articlehttp://purl.org/coar/resource_type/c_dcae04bcuuid:44856555-4884-4ccd-8ebf-6b5aba5b11b0Symplectic Elements at OxfordFoundation for Open Access Statistics2017Scutari, MIt is well known in the literature that the problem of learning the structure of Bayesian networks is very hard to tackle: Its computational complexity is super-exponential in the number of nodes in the worst case and polynomial in most real-world scenarios. Efficient implementations of score-based structure learning benefit from past and current research in optimization theory, which can be adapted to the task by using the network score as the objective function to maximize. This is not true for approaches based on conditional independence tests, called constraint-based learning algorithms. The only optimization in widespread use, backtracking, leverages the symmetries implied by the definitions of neighborhood and Markov blanket. In this paper we illustrate how backtracking is implemented in recent versions of the bnlearn R package, and how it degrades the stability of Bayesian network structure learning for little gain in terms of speed. As an alternative, we describe a software architecture and framework that can be used to parallelize constraint-based structure learning algorithms (also implemented in bnlearn) and we demonstrate its performance using four reference networks and two real-world data sets from genetics and systems biology. We show that on modern multi-core or multiprocessor hardware parallel implementations are preferable over backtracking, which was developed when single-processor machines were the norm.
spellingShingle	Scutari, M Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package
title	Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package
title_full	Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package
title_fullStr	Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package
title_full_unstemmed	Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package
title_short	Bayesian network constraint-based structure learning algorithms: parallel and optimized implementations in the bnlearn R Package
title_sort	bayesian network constraint based structure learning algorithms parallel and optimized implementations in the bnlearn r package
work_keys_str_mv	AT scutarim bayesiannetworkconstraintbasedstructurelearningalgorithmsparallelandoptimizedimplementationsinthebnlearnrpackage

Similar Items