Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data

Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases whe...


Bibliographic Details
Main Authors: Alfred Ultsch, Jörn Lötsch
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: International Journal of Molecular Sciences
Subjects: data science; artificial intelligence; machine learning; digital medicine
Online Access: https://www.mdpi.com/1422-0067/23/22/14081
author Alfred Ultsch
Jörn Lötsch
collection DOAJ
description Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values. An approach for a robust extension of Bayesian inference is proposed that proceeds in two main steps starting from the Bayesian posterior probabilities. First, cases with low evidence are labeled as “uncertain” class membership. The boundary for low probabilities of class assignment (threshold ε) is calculated using a computed ABC analysis as a data-based technique for item categorization. This leaves a number of cases with uncertain classification (p < ε). Second, cases with uncertain class membership are relabeled based on the distance to neighboring classified cases based on Voronoi cells. The approach is demonstrated on biomedical data typically analyzed with Bayesian statistics, such as flow cytometric data sets or biomarkers used in medical diagnostics, where it increased the class assignment accuracy by 1–10% depending on the data set. The proposed extension of the Bayesian inference of class membership can be used to obtain robust and plausible class assignments even for data at the extremes of the distribution and/or for which evidence is weak.
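The two steps named in the description can be illustrated with a minimal one-dimensional sketch. Everything concrete in it is an assumption made for illustration and is not taken from the article: the class parameters in `CLASSES`, the equal priors, the helper names `bayes` and `robust_labels`, the reading of p as the evidence p(x) of a case, and a hand-picked `eps` (the article computes ε with a computed ABC analysis, which is not reproduced here). In one dimension, assigning an uncertain case to the Voronoi cell of a confidently classified case amounts to taking the label of its nearest confident neighbor.

```python
import math

# Hypothetical two-class setup: class "A" has the lower mean but the larger spread.
CLASSES = {"A": (0.0, 3.0), "B": (4.0, 1.0)}  # name -> (mean, standard deviation)
PRIOR = {"A": 0.5, "B": 0.5}                   # assumed equal priors


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes(x):
    """Return (label with the highest posterior, evidence p(x))."""
    joint = {c: PRIOR[c] * normal_pdf(x, m, s) for c, (m, s) in CLASSES.items()}
    evidence = sum(joint.values())
    return max(joint, key=joint.get), evidence


def robust_labels(cases, eps):
    """Flag cases with evidence below eps, then relabel them by nearest confident case."""
    results = [bayes(x) for x in cases]
    confident = [(x, label) for x, (label, ev) in zip(cases, results) if ev >= eps]
    if not confident:
        raise ValueError("eps leaves no confidently classified case")
    labels = []
    for x, (label, ev) in zip(cases, results):
        if ev < eps:
            # The nearest confident case is the seed of the 1-D Voronoi cell containing x.
            _, label = min(confident, key=lambda seed: abs(seed[0] - x))
        labels.append(label)
    return labels


cases = [-6.0, -3.0, 0.0, 1.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0]
plain = [bayes(x)[0] for x in cases]     # plain Bayesian assignment
robust = robust_labels(cases, eps=0.02)  # eps is chosen by hand in this sketch
print(list(zip(cases, plain, robust)))
```

With these assumed parameters, the plain Bayesian rule assigns the large values 8 and 10 to class "A", the class with the lower mean, because its wider density dominates in the far tail. After thresholding and nearest-neighbor relabeling they receive the label of the closest confidently classified case, which belongs to class "B".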
first_indexed 2024-03-09T18:16:59Z
format Article
id doaj.art-bda1b0126fb54ea0878fcd4d38563c82
institution Directory Open Access Journal
issn 1661-6596
1422-0067
language English
last_indexed 2024-03-09T18:16:59Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series International Journal of Molecular Sciences
spelling doaj.art-bda1b0126fb54ea0878fcd4d38563c82
2023-11-24T08:38:36Z
eng
MDPI AG
International Journal of Molecular Sciences
1661-6596
1422-0067
2022-11-01
volume 23, issue 22, article 14081
10.3390/ijms232214081
Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data
Alfred Ultsch (DataBionics Research Group, University of Marburg, Hans-Meerwein-Straße 22, 35032 Marburg, Germany)
Jörn Lötsch (Institute of Clinical Pharmacology, Goethe-University, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany)
https://www.mdpi.com/1422-0067/23/22/14081
data science
artificial intelligence
machine learning
digital medicine
title Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data
topic data science
artificial intelligence
machine learning
digital medicine
url https://www.mdpi.com/1422-0067/23/22/14081