Learning Mixed Multinomial Logit Models


Bibliographic Details
Main Author: Hu, Yiqun
Other Authors: Simchi-Levi, David
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/150428
Description: The multinomial logit (MNL) model is widely used to predict the probabilities of different outcomes. However, the standard MNL model suffers from several issues, including (but not limited to) population heterogeneity, the restrictive independence of irrelevant alternatives (IIA) assumption, and insufficient model capacity. To alleviate these issues, mixed multinomial logit (MMNL) models were introduced. MMNL models are highly flexible: McFadden and Train [2000] showed that, under appropriate assumptions, they can approximate any random-utility-based discrete choice model to an arbitrary degree of accuracy. In addition, MMNL removes other limitations of the standard MNL model: it lifts the IIA assumption, allows correlation in unobserved utility factors over time, and, most importantly, reduces the chance of model misspecification in real-world applications, where the data composition is often heterogeneous. Despite its importance and versatility, the learning theory of MMNL has received limited study, and learning MMNL models remains an open research topic. This thesis tackles the learning problem from two perspectives. First, inspired by recent work on Gaussian mixture models (GMMs), we explore the polynomial learnability of MMNL models from a theoretical point of view. Second, we present an algorithm designed to be more broadly applicable, one that exploits the rich data available in the modern digital era while still yielding ideal statistical properties for the estimators. Chapter 2 studies the polynomial learnability of MMNL models with a general number K of mixtures, extending current results that apply only to 2-MNL models.
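To make the objects of study concrete: a minimal sketch of MNL choice probabilities (a softmax over deterministic utilities) and of a K-mixture MMNL (a weighted average of component MNL probability vectors). The utilities and mixture weights below are hypothetical illustrations, not values from the thesis.

```python
import math

def mnl_probs(utilities):
    """Standard MNL: choice probabilities are a softmax of the
    alternatives' deterministic utilities."""
    m = max(utilities)                          # subtract the max for stability
    exps = [math.exp(u - m) for u in utilities]
    z = sum(exps)
    return [e / z for e in exps]

def mmnl_probs(weights, utility_sets):
    """K-mixture MMNL: a weighted average of K component MNL
    probability vectors, one per latent segment."""
    n = len(utility_sets[0])
    mix = [0.0] * n
    for w, utils in zip(weights, utility_sets):
        comp = mnl_probs(utils)
        mix = [p + w * c for p, c in zip(mix, comp)]
    return mix

# Two hypothetical segments with opposite preferences over three alternatives.
print(mnl_probs([2.0, 1.0, 0.0]))
print(mmnl_probs([0.6, 0.4], [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]))
```

Because the mixture averages segment-level probability vectors, the ratio between two alternatives' choice probabilities depends on which segments favor them — which is exactly how MMNL escapes the IIA restriction of a single MNL.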
We analyze the existence of ϵ-close estimates using tools from abstract algebra and show that, under some reasonable assumptions, there exists an algorithm that can learn a general K-MNL model, if identifiable, with probability at least 1−δ, using a polynomial number of data samples and a polynomial number of operations (in 1/ϵ and 1/δ). In Chapter 3, motivated by the Frank-Wolfe (FW) algorithm, we propose a framework that learns both the mixture weights and the component-specific logit parameters, with provable convergence guarantees for an arbitrary number of mixtures. Our algorithm uses historical choice data to generate a set of candidate choice probability vectors, each ϵ-close to the ground truth with high probability. The convex hull of this set forms a shrunken feasible region with properties desirable for the linear subproblems in FW, which in turn enables independent parameter estimation within each mixture and leads to convergence of the mixture weights. This framework also resolves the unboundedness of the estimated parameters present in the original FW approach. Complexity analysis shows that only a polynomial number of samples is required for each candidate in the target population. Chapter 4 presents extensive numerical experiments, including both simulations and case studies on the well-known Nielsen Consumer Panel Data, demonstrating the framework's effectiveness at recovering the true model parameters and/or learning realistic component-level parameters, as compared to the original FW framework.
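The core Frank-Wolfe mechanics behind the Chapter 3 framework can be illustrated with a toy iteration: given a fixed set of candidate choice probability vectors, fit mixture weights over their convex hull so the mixture matches the empirical choice frequencies. This is a simplified sketch under assumed inputs (a least-squares objective and two given candidates), not the thesis's actual algorithm.

```python
def fw_mixture_weights(candidates, target, iters=500):
    """Frank-Wolfe over the probability simplex: find weights alpha so that
    sum_k alpha[k] * candidates[k] approximates the empirical frequencies
    `target` in least squares. Each FW linear subproblem over the simplex is
    solved exactly by picking the single candidate (a simplex vertex) with
    the smallest gradient coordinate."""
    K, n = len(candidates), len(candidates[0])
    alpha = [1.0 / K] * K                       # start at the barycenter
    for t in range(iters):
        # Current mixture and residual against the empirical frequencies.
        mix = [sum(alpha[k] * candidates[k][i] for k in range(K))
               for i in range(n)]
        resid = [mix[i] - target[i] for i in range(n)]
        # Gradient of ||mix - target||^2 with respect to each weight.
        grad = [2.0 * sum(candidates[k][i] * resid[i] for i in range(n))
                for k in range(K)]
        s = min(range(K), key=grad.__getitem__)  # linear minimization step
        gamma = 2.0 / (t + 2)                    # standard FW step size
        alpha = [(1.0 - gamma) * a for a in alpha]
        alpha[s] += gamma
    return alpha

# Two hypothetical candidate MNL choice-probability vectors; the empirical
# frequencies here come from an even 50/50 blend of the two.
cands = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
target = [0.4, 0.2, 0.4]
print(fw_mixture_weights(cands, target))
```

Because every FW iterate is a convex combination of the candidates, the weights stay in the simplex by construction — a small-scale analogue of how restricting the feasible region to the candidates' convex hull keeps the estimates bounded.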
Department: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Degree: Ph.D.
Thesis Date: 2022-09
Rights: In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/)