Bayesian Network Model Averaging Classifiers by Subbagging

When applied to classification problems, Bayesian networks are often used to infer a class variable given feature variables. Earlier reports have shown that the classification accuracy of Bayesian network structures learned by maximizing the marginal likelihood (ML) is lower than that of structures learned by maximizing the conditional log likelihood (CLL) of the class variable given the feature variables. Nevertheless, because ML is asymptotically consistent, structures learned by maximizing ML are not necessarily worse classifiers than those learned by maximizing CLL on large datasets. For small sample sizes, however, the error of structure learning by maximizing ML grows much larger, and that error degrades classification accuracy. To resolve this shortcoming, model averaging has been proposed, which marginalizes the class-variable posterior over all structures. However, the posterior standard error of each structure in the model averaging grows as the sample size shrinks, which again degrades classification accuracy. The main idea of this study is to improve classification accuracy by using subbagging, a modified bagging that uses random sampling without replacement, to reduce the posterior standard error of each structure in model averaging. Moreover, to guarantee asymptotic consistency, we use the K-best method with the ML score. Experimental results demonstrate that the proposed method classifies more accurately than earlier Bayesian network classifier (BNC) methods and other state-of-the-art ensemble methods.
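
The subbagging step described in the abstract can be sketched generically. The snippet below is a minimal illustration, not the authors' implementation: it assumes a hypothetical fit_classifier factory (standing in for the K-best Bayesian network model-averaging learner with the ML score), draws subsamples without replacement, and averages the predicted class posteriors over the ensemble.

    import numpy as np

    def subbagging_predict_proba(X_train, y_train, X_test, fit_classifier,
                                 n_estimators=10, subsample_ratio=0.7, seed=None):
        """Average class posteriors over classifiers fitted on random
        subsamples drawn WITHOUT replacement (subbagging), as opposed to
        ordinary bagging, which samples WITH replacement.

        fit_classifier(X, y) is a hypothetical factory returning an object
        with a predict_proba method; in the paper's setting this would be
        the K-best Bayesian network model-averaging classifier.
        """
        rng = np.random.default_rng(seed)
        n = len(y_train)
        m = int(subsample_ratio * n)  # subsample size m < n
        posteriors = []
        for _ in range(n_estimators):
            idx = rng.choice(n, size=m, replace=False)  # no duplicated records
            model = fit_classifier(X_train[idx], y_train[idx])
            posteriors.append(model.predict_proba(X_test))
        # Averaging over subsamples reduces the variance (posterior standard
        # error) of the estimated class posteriors.
        return np.mean(posteriors, axis=0)

Predicted labels are then the argmax of the averaged posterior; this variance reduction from averaging over subsamples is the mechanism the abstract credits for the accuracy gains at small sample sizes.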

Bibliographic Details
Main Authors: Shouta Sugahara, Itsuki Aomi, Maomi Ueno
Affiliations: Graduate School of Informatics and Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi 182-8585, Japan (Sugahara, Ueno); Sansan Inc., Tokyo 150-0001, Japan (Aomi)
Format: Article
Language: English
Published: MDPI AG, 2022-05-01
Series: Entropy, Vol. 24, No. 5, Article 743
ISSN: 1099-4300
DOI: 10.3390/e24050743
Subjects: Bayesian networks; classification; model averaging; structure learning
Online Access: https://www.mdpi.com/1099-4300/24/5/743