An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
Abstract We consider maximum likelihood estimation for Gaussian Mixture Models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness co...
Main Authors: Hosseini, Reshad; Sra, Suvrit
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Massachusetts Institute of Technology. Institute for Data, Systems, and Society
Format: Article
Language: English
Published: Springer Berlin Heidelberg, 2021
Online Access: https://hdl.handle.net/1721.1/131361
_version_ | 1811072679201996800 |
author | Hosseini, Reshad Sra, Suvrit |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Hosseini, Reshad Sra, Suvrit |
author_sort | Hosseini, Reshad |
collection | MIT |
description | Abstract
We consider maximum likelihood estimation for Gaussian Mixture Models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods. |
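The abstract only names the approach, so a concrete illustration may help. The sketch below shows what one batch Riemannian gradient step could look like for the kind of reformulation the abstract alludes to: each component's mean and covariance are absorbed into a single augmented positive definite matrix, and updates follow the Riemannian gradient under the affine-invariant metric with the exponential-map retraction. This is a minimal sketch written for this record, not the authors' implementation; the function names (spd_sqrt, exp_map, log_q, riemannian_step), the fixed mixture weights, the step size, and the toy data are all illustrative assumptions.

```python
# Minimal sketch (NOT the authors' code) of one batch Riemannian
# gradient-ascent step for GMM likelihood on the SPD manifold.
# Assumption: each component j is reparameterized by a single
# (d+1)x(d+1) SPD matrix S_j acting on augmented points y = (x, 1);
# mixture weights are held fixed here for brevity.
import numpy as np
from scipy.linalg import expm

def spd_sqrt(S):
    """Symmetric square root of an SPD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs * np.sqrt(vals)) @ vecs.T

def exp_map(S, xi):
    """Exponential map at S under the affine-invariant metric:
    Exp_S(xi) = S^{1/2} expm(S^{-1/2} xi S^{-1/2}) S^{1/2}."""
    R = spd_sqrt(S)
    Rinv = np.linalg.inv(R)
    return R @ expm(Rinv @ xi @ Rinv) @ R

def log_q(Y, S):
    """Log of the reparameterized component density: a zero-mean
    Gaussian in y = (x, 1), rescaled by (2*pi)^{1/2} e^{1/2} so it
    matches N(x; mu, Sigma) at the constrained optimum."""
    d1 = S.shape[0]
    L = np.linalg.cholesky(S)
    z = np.linalg.solve(L, Y.T)                     # columns L^{-1} y_i
    quad = np.sum(z * z, axis=0)                    # y_i^T S^{-1} y_i
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return (0.5 * (np.log(2 * np.pi) + 1.0)
            - 0.5 * (d1 * np.log(2 * np.pi) + logdet + quad))

def riemannian_step(Y, Ss, w, eta):
    """One Riemannian gradient-ascent step on all component matrices."""
    logp = np.stack([log_q(Y, S) for S in Ss], axis=1) + np.log(w)
    m = logp.max(axis=1, keepdims=True)             # log-sum-exp trick
    r = np.exp(logp - m)
    r /= r.sum(axis=1, keepdims=True)               # responsibilities
    out = []
    for j, S in enumerate(Ss):
        Sinv = np.linalg.inv(S)
        W = (Y * r[:, [j]]).T @ Y                   # sum_i r_ij y_i y_i^T
        # Euclidean gradient: 0.5 (S^{-1} W S^{-1} - (sum_i r_ij) S^{-1})
        G = 0.5 * (Sinv @ W @ Sinv - r[:, j].sum() * Sinv)
        xi = S @ G @ S                              # Riemannian gradient
        Snew = exp_map(S, eta * xi)
        out.append(0.5 * (Snew + Snew.T))           # re-symmetrize
    return out

# Toy usage: two well-separated components in d = 2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
Y = np.hstack([X, np.ones((len(X), 1))])            # augment: y = (x, 1)
Ss = [np.eye(3), 2 * np.eye(3)]
w = np.array([0.5, 0.5])
for _ in range(100):
    Ss = riemannian_step(Y, Ss, w, eta=1.0 / len(Y))
```

In this sketch, the stochastic variant analyzed in the paper would amount to replacing Y with a freshly sampled mini-batch of augmented points in each call to riemannian_step, with a suitably decaying step size; the paper also optimizes the mixture weights, which are held fixed here for brevity.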
first_indexed | 2024-09-23T09:10:07Z |
format | Article |
id | mit-1721.1/131361 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T09:10:07Z |
publishDate | 2021 |
publisher | Springer Berlin Heidelberg |
record_format | dspace |
spelling | mit-1721.1/131361 2023-12-08T20:50:32Z An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization Hosseini, Reshad Sra, Suvrit Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology. Institute for Data, Systems, and Society Abstract We consider maximum likelihood estimation for Gaussian Mixture Models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods. 2021-09-20T17:16:44Z 2021-09-20T17:16:44Z 2019-03-19 2020-09-24T21:02:27Z Article http://purl.org/eprint/type/JournalArticle https://hdl.handle.net/1721.1/131361 en https://doi.org/10.1007/s10107-019-01381-4 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society application/pdf Springer Berlin Heidelberg Springer Berlin Heidelberg |
spellingShingle | Hosseini, Reshad Sra, Suvrit An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization |
title | An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization |
title_full | An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization |
title_fullStr | An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization |
title_full_unstemmed | An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization |
title_short | An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization |
title_sort | alternative to em for gaussian mixture models batch and stochastic riemannian optimization |
url | https://hdl.handle.net/1721.1/131361 |
work_keys_str_mv | AT hosseinireshad analternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization AT srasuvrit analternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization AT hosseinireshad alternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization AT srasuvrit alternativetoemforgaussianmixturemodelsbatchandstochasticriemannianoptimization |