On The Performance Of The Maximum Likelihood Over Large Models

This dissertation investigates non-parametric regression over large function classes, specifically non-Donsker classes. We present the concept of non-Donsker classes and study the statistical performance of the Least Squares Estimator (LSE) --- which also serves as the Maximum Likelihood Estimator (MLE) under Gaussian noise --- over these classes.


Bibliographic Details
Main Author: Kur, Gil
Other Authors: Rakhlin, Alexander
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/152867
_version_ 1826211908350377984
author Kur, Gil
author2 Rakhlin, Alexander
author_facet Rakhlin, Alexander
Kur, Gil
author_sort Kur, Gil
collection MIT
description This dissertation investigates non-parametric regression over large function classes, specifically non-Donsker classes. We present the concept of non-Donsker classes and study the statistical performance of the Least Squares Estimator (LSE) --- which also serves as the Maximum Likelihood Estimator (MLE) under Gaussian noise --- over these classes. (1) We demonstrate the minimax sub-optimality of the LSE in the non-Donsker regime, extending the classical findings of Birgé and Massart '93 and resolving a longstanding conjecture of Gardner, Markus and Milanfar '06. (2) We reveal that in the non-Donsker regime, the sub-optimality of the LSE arises solely from its elevated bias term (in the bias--variance decomposition). (3) We introduce the first minimax-optimal algorithm for multivariate convex regression with polynomial runtime in the number of samples, showing that the sub-optimality of the LSE can be overcome in efficient runtime. (4) We study the minimal error of the LSE in both random and fixed design settings.
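Point (2) of the abstract rests on the bias--variance decomposition of the LSE's risk. As a small illustration (not taken from the thesis), the sketch below uses one-dimensional isotonic regression, i.e. the LSE over monotone sequences computed by the Pool Adjacent Violators algorithm, and checks by Monte Carlo that the empirical risk splits exactly into a squared-bias term and a variance term. The design, the true function `f`, and the noise level are arbitrary choices for the demonstration.

```python
import numpy as np

def pava(y):
    """Isotonic least squares via Pool Adjacent Violators:
    the LSE over the class of non-decreasing sequences."""
    blocks = []  # each block is [mean, weight]
    for v in y:
        blocks.append([float(v), 1])
        # merge adjacent blocks while they violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    return np.concatenate([np.full(w, m) for m, w in blocks])

rng = np.random.default_rng(0)
n, reps, sigma = 50, 2000, 1.0
x = np.arange(1, n + 1) / n
f = x ** 2                      # true (monotone) regression function
fits = np.empty((reps, n))
for r in range(reps):
    fits[r] = pava(f + sigma * rng.standard_normal(n))

mean_fit = fits.mean(axis=0)
risk = ((fits - f) ** 2).sum(axis=1).mean() / n        # E ||f_hat - f||_n^2
bias2 = ((mean_fit - f) ** 2).sum() / n                # ||E f_hat - f||_n^2
var = ((fits - mean_fit) ** 2).sum(axis=1).mean() / n  # E ||f_hat - E f_hat||_n^2
print(f"risk={risk:.4f}  bias^2={bias2:.4f}  variance={var:.4f}")
```

Because `mean_fit` is the empirical mean over replications, the identity risk = bias² + variance holds exactly here (the cross term vanishes). The thesis's point (2) concerns how these two terms scale with the sample size over non-Donsker classes, where the bias term is the source of sub-optimality.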
first_indexed 2024-09-23T15:13:21Z
format Thesis
id mit-1721.1/152867
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T15:13:21Z
publishDate 2023
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/152867 2023-11-03T03:12:14Z On The Performance Of The Maximum Likelihood Over Large Models Kur, Gil Rakhlin, Alexander Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science This dissertation investigates non-parametric regression over large function classes, specifically non-Donsker classes. We present the concept of non-Donsker classes and study the statistical performance of the Least Squares Estimator (LSE) --- which also serves as the Maximum Likelihood Estimator (MLE) under Gaussian noise --- over these classes. (1) We demonstrate the minimax sub-optimality of the LSE in the non-Donsker regime, extending the classical findings of Birgé and Massart '93 and resolving a longstanding conjecture of Gardner, Markus and Milanfar '06. (2) We reveal that in the non-Donsker regime, the sub-optimality of the LSE arises solely from its elevated bias term (in the bias--variance decomposition). (3) We introduce the first minimax-optimal algorithm for multivariate convex regression with polynomial runtime in the number of samples, showing that the sub-optimality of the LSE can be overcome in efficient runtime. (4) We study the minimal error of the LSE in both random and fixed design settings. Ph.D.
2023-11-02T20:23:27Z 2023-11-02T20:23:27Z 2023-09 2023-09-21T14:26:07.968Z Thesis https://hdl.handle.net/1721.1/152867 0000-0001-7386-1686 In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Kur, Gil
On The Performance Of The Maximum Likelihood Over Large Models
title On The Performance Of The Maximum Likelihood Over Large Models
title_full On The Performance Of The Maximum Likelihood Over Large Models
title_fullStr On The Performance Of The Maximum Likelihood Over Large Models
title_full_unstemmed On The Performance Of The Maximum Likelihood Over Large Models
title_short On The Performance Of The Maximum Likelihood Over Large Models
title_sort on the performance of the maximum likelihood over large models
url https://hdl.handle.net/1721.1/152867
work_keys_str_mv AT kurgil ontheperformanceofthemaximumlikelihoodoverlargemodels