New data-driven approaches to improve probabilistic model structure learning

To learn the network structures used in probabilistic models (e.g., Bayesian networks), many researchers have proposed structure learning algorithms that extract the network structure from data. However, structure learning is a challenging problem due to the extremely large number of possible structure cand...


Bibliographic Details
Main Author: Zhao, Jianjun
Other Authors: Pan Jialin, Sinno
Format: Thesis
Language: English
Published: 2019
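Subjects: Science::Mathematics::Statistics
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence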
Online Access: https://hdl.handle.net/10356/84123
http://hdl.handle.net/10220/50443
collection NTU
description To learn the network structures used in probabilistic models (e.g., Bayesian networks), many researchers have proposed structure learning algorithms that extract the network structure from data. However, structure learning is a challenging problem because the number of possible candidate structures is extremely large. One challenge in Bayesian network structure learning is that the local structures returned by local structure learning algorithms can conflict with one another; this is the so-called symmetry correction problem. Another challenge is the V-structure selection problem, which concerns determining edge orientations in a Bayesian network. In this thesis, we investigate these two challenges and propose novel data-driven approaches to overcome them when building a Bayesian network. First, two new data-driven symmetry correction methods are developed to learn the undirected graph of a Bayesian network. The proposed methods outperform the existing heuristic rule. Second, the V-structure selection problem is formulated as a weighted maximum satisfiability (MAX-SAT) problem, in which the weights are learned from data to quantify the strength of the V-structures. Our proposed solution outperforms existing methods. In addition, we investigate how transfer learning can be used for structure learning when only limited training examples and a source structure are available. In particular, we propose a transfer learning approach to learn the structure of a Sum-Product Network (SPN), which can be converted to a Bayesian network under certain conditions. Our approach allows one to construct the target SPN from limited training examples, given an existing source SPN from a similar domain.
first_indexed 2024-10-01T07:22:20Z
format Thesis
id ntu-10356/84123
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:22:20Z
publishDate 2019
record_format dspace
spelling ntu-10356/84123 2020-10-28T08:40:48Z New data-driven approaches to improve probabilistic model structure learning Zhao, Jianjun Pan Jialin, Sinno School of Computer Science and Engineering A*STAR (SINGA) Centre for Computational Intelligence Science::Mathematics::Statistics Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Doctor of Philosophy 2019-11-19T12:07:54Z 2019-12-06T15:38:48Z 2019 Thesis Zhao, J. (2019). New data-driven approaches to improve probabilistic model structure learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/84123 http://hdl.handle.net/10220/50443 10.32657/10356/84123 en 125 p. application/pdf
title New data-driven approaches to improve probabilistic model structure learning
topic Science::Mathematics::Statistics
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence