An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization

Abstract We consider maximum likelihood estimation for Gaussian Mixture Models (Gmm s). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness co...

Full description

Bibliographic Details
Main Authors: Hosseini, Reshad, Sra, Suvrit
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language:English
Published: Springer Berlin Heidelberg 2021
Online Access:https://hdl.handle.net/1721.1/131361
_version_ 1811072679201996800
author Hosseini, Reshad
Sra, Suvrit
author2 Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
author_facet Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Hosseini, Reshad
Sra, Suvrit
author_sort Hosseini, Reshad
collection MIT
description Abstract We consider maximum likelihood estimation for Gaussian Mixture Models (Gmm s). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast Gmm parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
first_indexed 2024-09-23T09:10:07Z
format Article
id mit-1721.1/131361
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T09:10:07Z
publishDate 2021
publisher Springer Berlin Heidelberg
record_format dspace
spelling mit-1721.1/1313612023-12-08T20:50:32Z An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization Hosseini, Reshad Sra, Suvrit Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Institute for Data, Systems, and Society Abstract We consider maximum likelihood estimation for Gaussian Mixture Models (Gmm s). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast Gmm parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods. 2021-09-20T17:16:44Z 2021-09-20T17:16:44Z 2019-03-19 2020-09-24T21:02:27Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/131361 en https://doi.org/10.1007/s10107-019-01381-4 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society application/pdf Springer Berlin Heidelberg Springer Berlin Heidelberg
spellingShingle Hosseini, Reshad
Sra, Suvrit
An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
title An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
title_full An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
title_fullStr An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
title_full_unstemmed An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
title_short An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
title_sort alternative to em for gaussian mixture models batch and stochastic riemannian optimization
url https://hdl.handle.net/1721.1/131361
work_keys_str_mv AT hosseinireshad analternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization
AT srasuvrit analternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization
AT hosseinireshad alternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization
AT srasuvrit alternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization