Convergence Results for the EM Approach to Mixtures of Experts Architectures

Bibliographic Details
Main Authors: Jordan, Michael I.; Xu, Lei
Language: English (en_US)
Report Number: AIM-1458
Date Issued: November 1, 1993
Deposited: 2004
Institution: Massachusetts Institute of Technology
Online Access: http://hdl.handle.net/1721.1/6620
Description:
The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs (1993) recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architecture of Jordan and Jacobs (1992). They showed empirically that the EM algorithm for these architectures yields significantly faster convergence than gradient ascent. In the current paper we provide a theoretical analysis of this algorithm. We show that the algorithm can be regarded as a variable metric algorithm whose search direction has a positive projection onto the gradient of the log likelihood. We also analyze the convergence of the algorithm and provide an explicit expression for the convergence rate. In addition, we describe an acceleration technique that yields a significant speedup in simulation experiments.
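The abstract describes EM applied to a mixture of experts: an E-step computes each expert's posterior responsibility for each data point, and an M-step refits the experts and the gating network against those responsibilities. The Python sketch below illustrates the idea on a minimal model; it is not the authors' code, and its specific choices (linear experts with fixed unit noise variance, a softmax gating network updated by a few gradient steps, i.e. a generalized EM M-step) are assumptions made purely for illustration.

    # Minimal, illustrative sketch of EM for a mixture of linear experts.
    # Modeling choices here (unit noise variance, gradient-based gate update)
    # are assumptions for the sketch, not the paper's exact algorithm.
    import numpy as np

    def softmax(z):
        z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    def em_mixture_of_experts(X, y, n_experts=2, n_iters=50, gate_lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        W = rng.normal(scale=0.1, size=(n_experts, d))  # expert weights
        V = rng.normal(scale=0.1, size=(n_experts, d))  # gating weights
        for _ in range(n_iters):
            # E-step: responsibility h_ij proportional to
            # g_j(x_i) * N(y_i | w_j^T x_i, 1)
            g = softmax(X @ V.T)                      # (n, k) gate probabilities
            resid = y[:, None] - X @ W.T              # (n, k) expert residuals
            lik = np.exp(-0.5 * resid**2) / np.sqrt(2 * np.pi)
            h = g * lik + 1e-12                       # small floor avoids 0/0
            h /= h.sum(axis=1, keepdims=True)
            # M-step (experts): weighted least squares, closed form per expert
            for j in range(n_experts):
                hw = h[:, j]
                A = X.T @ (hw[:, None] * X) + 1e-8 * np.eye(d)
                W[j] = np.linalg.solve(A, X.T @ (hw * y))
            # M-step (gate): a few ascent steps on the expected complete-data
            # log likelihood, whose gradient in V is (h - g)^T X
            for _ in range(5):
                g = softmax(X @ V.T)
                V += gate_lr * (h - g).T @ X / n
        return W, V

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        X = np.column_stack([rng.uniform(-1, 1, 200), np.ones(200)])  # input + bias
        # piecewise-linear target: two regimes, one per expert
        y = np.where(X[:, 0] > 0, 2 * X[:, 0], -X[:, 0]) + 0.05 * rng.normal(size=200)
        W, V = em_mixture_of_experts(X, y)
        print("expert weights:\n", W)

Note that the expert update above is a weighted least-squares solve rather than a single gradient step, so each EM iteration can make much larger progress than gradient ascent; this is consistent with the faster convergence the paper analyzes.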