Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data
Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases whe...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2022-11-01
|
Series: | International Journal of Molecular Sciences |
Subjects: | |
Online Access: | https://www.mdpi.com/1422-0067/23/22/14081 |
_version_ | 1797465122386477056 |
---|---|
author | Alfred Ultsch Jörn Lötsch |
author_facet | Alfred Ultsch Jörn Lötsch |
author_sort | Alfred Ultsch |
collection | DOAJ |
description | Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values. An approach for a robust extension of Bayesian inference is proposed that proceeds in two main steps starting from the Bayesian posterior probabilities. First, cases with low evidence are labeled as “uncertain” class membership. The boundary for low probabilities of class assignment (threshold <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mi>ε</mi></semantics></math></inline-formula>) is calculated using a computed ABC analysis as a data-based technique for item categorization. This leaves a number of cases with uncertain classification (<i>p</i> < <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mi>ε</mi></semantics></math></inline-formula>). Second, cases with uncertain class membership are relabeled based on the distance to neighboring classified cases based on Voronoi cells. The approach is demonstrated on biomedical data typically analyzed with Bayesian statistics, such as flow cytometric data sets or biomarkers used in medical diagnostics, where it increased the class assignment accuracy by 1–10% depending on the data set. The proposed extension of the Bayesian inference of class membership can be used to obtain robust and plausible class assignments even for data at the extremes of the distribution and/or for which evidence is weak. |
first_indexed | 2024-03-09T18:16:59Z |
format | Article |
id | doaj.art-bda1b0126fb54ea0878fcd4d38563c82 |
institution | Directory Open Access Journal |
issn | 1661-6596 1422-0067 |
language | English |
last_indexed | 2024-03-09T18:16:59Z |
publishDate | 2022-11-01 |
publisher | MDPI AG |
record_format | Article |
series | International Journal of Molecular Sciences |
spelling | doaj.art-bda1b0126fb54ea0878fcd4d38563c822023-11-24T08:38:36ZengMDPI AGInternational Journal of Molecular Sciences1661-65961422-00672022-11-0123221408110.3390/ijms232214081Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical DataAlfred Ultsch0Jörn Lötsch1DataBionics Research Group, University of Marburg, Hans-Meerwein-Straße 22, 35032 Marburg, GermanyInstitute of Clinical Pharmacology, Goethe-University, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, GermanyBayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values. An approach for a robust extension of Bayesian inference is proposed that proceeds in two main steps starting from the Bayesian posterior probabilities. First, cases with low evidence are labeled as “uncertain” class membership. The boundary for low probabilities of class assignment (threshold <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mi>ε</mi></semantics></math></inline-formula>) is calculated using a computed ABC analysis as a data-based technique for item categorization. This leaves a number of cases with uncertain classification (<i>p</i> < <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><mi>ε</mi></semantics></math></inline-formula>). Second, cases with uncertain class membership are relabeled based on the distance to neighboring classified cases based on Voronoi cells. The approach is demonstrated on biomedical data typically analyzed with Bayesian statistics, such as flow cytometric data sets or biomarkers used in medical diagnostics, where it increased the class assignment accuracy by 1–10% depending on the data set. The proposed extension of the Bayesian inference of class membership can be used to obtain robust and plausible class assignments even for data at the extremes of the distribution and/or for which evidence is weak.https://www.mdpi.com/1422-0067/23/22/14081data scienceartificial intelligencemachine learningdigital medicine |
spellingShingle | Alfred Ultsch Jörn Lötsch Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data International Journal of Molecular Sciences data science artificial intelligence machine learning digital medicine |
title | Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data |
title_full | Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data |
title_fullStr | Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data |
title_full_unstemmed | Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data |
title_short | Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data |
title_sort | robust classification using posterior probability threshold computation followed by voronoi cell based class assignment circumventing pitfalls of bayesian analysis of biomedical data |
topic | data science artificial intelligence machine learning digital medicine |
url | https://www.mdpi.com/1422-0067/23/22/14081 |
work_keys_str_mv | AT alfredultsch robustclassificationusingposteriorprobabilitythresholdcomputationfollowedbyvoronoicellbasedclassassignmentcircumventingpitfallsofbayesiananalysisofbiomedicaldata AT jornlotsch robustclassificationusingposteriorprobabilitythresholdcomputationfollowedbyvoronoicellbasedclassassignmentcircumventingpitfallsofbayesiananalysisofbiomedicaldata |