Optimal clustering under uncertainty.

Classical clustering algorithms typically either lack an underlying probability framework to make them predictive or focus on parameter estimation rather than defining and minimizing a notion of error. Recent work addresses these issues by developing a probabilistic framework based on the theory of...


Bibliographic Details
Main Authors: Lori A Dalton, Marco E Benalcázar, Edward R Dougherty
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC6168142?pdf=render
collection DOAJ
description Classical clustering algorithms typically either lack an underlying probability framework to make them predictive or focus on parameter estimation rather than defining and minimizing a notion of error. Recent work addresses these issues by developing a probabilistic framework based on the theory of random labeled point processes and characterizing a Bayes clusterer that minimizes the number of misclustered points. The Bayes clusterer is analogous to the Bayes classifier. Whereas determining a Bayes classifier requires full knowledge of the feature-label distribution, deriving a Bayes clusterer requires full knowledge of the point process. When uncertain of the point process, one would like to find a robust clusterer that is optimal over the uncertainty, just as one may find optimal robust classifiers with uncertain feature-label distributions. Herein, we derive an optimal robust clusterer by first finding an effective random point process that incorporates all randomness within its own probabilistic structure and from which a Bayes clusterer can be derived that provides an optimal robust clusterer relative to the uncertainty. This is analogous to the use of effective class-conditional distributions in robust classification. After evaluating the performance of robust clusterers in synthetic mixtures of Gaussians models, we apply the framework to granular imaging, where we make use of the asymptotic granulometric moment theory for granular images to relate robust clustering theory to the application.
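The idea in the description can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical one-dimensional, two-cluster Gaussian model with means at -theta and +theta, unit variance, equal label priors, and a finite prior over the unknown theta. The names `robust_clusterer`, `labeled_density`, `mismatch`, and `partitions` are invented for this example, and the error measure (fraction of mismatched labels, minimized over swapping the two labels) is an assumed stand-in for "the number of misclustered points". Averaging the labeled density over the prior plays the role of the effective process, and the clusterer picks the partition with the smallest expected mismatch under it.

```python
import itertools
import numpy as np

def partitions(n):
    """All 2-cluster labelings of n points, one per partition (point 0 fixed to label 0)."""
    return [np.array((0,) + rest) for rest in itertools.product((0, 1), repeat=n - 1)]

def mismatch(a, b):
    """Fraction of misclustered points, minimized over swapping the two labels."""
    d = float(np.mean(a != b))
    return min(d, 1.0 - d)

def labeled_density(x, labels, theta):
    """Joint density of points and labels: means -theta/+theta, unit variance, equal priors."""
    means = np.where(labels == 1, theta, -theta)
    return float(np.prod(0.5 * np.exp(-0.5 * (x - means) ** 2) / np.sqrt(2 * np.pi)))

def robust_clusterer(x, thetas, prior):
    """Brute-force minimizer of expected mismatch under the prior-averaged model."""
    cands = partitions(len(x))
    # Assumed effective process: average the labeled density over the prior on theta.
    # A partition corresponds to a labeling together with its label-swapped twin.
    post = np.array([
        sum(p * (labeled_density(x, y, t) + labeled_density(x, 1 - y, t))
            for t, p in zip(thetas, prior))
        for y in cands
    ])
    post /= post.sum()  # posterior probability of each partition given the points
    # Expected mismatch of every candidate partition against the posterior.
    risk = [sum(q * mismatch(c, y) for q, y in zip(post, cands)) for c in cands]
    best = int(np.argmin(risk))
    return cands[best], risk[best]

# Hypothetical data: two well-separated groups, theta known only up to a prior.
x = np.array([-2.1, -1.8, -2.4, 1.9, 2.2, 2.0])
labels, risk = robust_clusterer(x, thetas=[1.5, 2.0, 2.5], prior=[0.3, 0.4, 0.3])
print(labels, risk)
```

The enumeration visits 2^(n-1) partitions, so the sketch is only practical for a handful of points; it is meant to show the structure of the optimization, not a scalable method.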
first_indexed 2024-12-21T16:26:06Z
id doaj.art-732e553d17dd4b3386a01ab6fe7cc915
institution Directory of Open Access Journals
issn 1932-6203
last_indexed 2024-12-21T16:26:06Z
spelling doaj.art-732e553d17dd4b3386a01ab6fe7cc915; 2022-12-21T18:57:26Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2018-01-01; vol. 13, no. 10, e0204627; doi:10.1371/journal.pone.0204627; Optimal clustering under uncertainty; Lori A Dalton, Marco E Benalcázar, Edward R Dougherty; http://europepmc.org/articles/PMC6168142?pdf=render