Efficient Statistics, in High Dimensions, from Truncated Samples

We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples. Truncated samples from a d-variate normal N(μ, Σ) means a sample is only revealed...
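
The truncation model described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's estimation algorithm; it only generates truncated samples by rejection, assuming a membership oracle for S is available as a Python callable. The names `cholesky`, `truncated_samples`, and `in_S` are hypothetical and chosen for illustration, and only the standard library is used.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def cholesky(sigma: Sequence[Sequence[float]]) -> List[Vector]:
    """Return lower-triangular L with L * L^T == sigma (sigma positive definite)."""
    d = len(sigma)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def truncated_samples(
    mu: Sequence[float],
    sigma: Sequence[Sequence[float]],
    in_S: Callable[[Vector], bool],  # membership oracle for S (assumed given)
    n: int,
    rng: random.Random,
) -> List[Vector]:
    """Draw from N(mu, sigma) and reveal only the draws that fall in S.

    The number of hidden (rejected) draws is deliberately not returned,
    mirroring the setting in which that count is unobserved.
    The loop is slow if S has very small measure under N(mu, sigma).
    """
    L = cholesky(sigma)
    d = len(mu)
    revealed: List[Vector] = []
    while len(revealed) < n:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x = [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        if in_S(x):
            revealed.append(x)
    return revealed


def positive_orthant(x: Vector) -> bool:
    """Example truncation set S: all coordinates strictly positive."""
    return all(c > 0 for c in x)


if __name__ == "__main__":
    data = truncated_samples(
        [0.0, 0.0], [[1.0, 0.5], [0.5, 2.0]], positive_orthant, 1000, random.Random(0)
    )
    naive_mean = [sum(x[i] for x in data) / len(data) for i in range(2)]
    print(naive_mean)  # both entries are positive although the true mean is (0, 0)
```

With S taken to be the positive orthant, the empirical mean of the revealed samples is necessarily positive in every coordinate even though the true mean is the origin, which shows why the naive estimator is biased and a dedicated method is needed.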


Bibliographic details
Main authors: Daskalakis, Constantinos; Gouleakis, Themis; Tzamos, Chistos; Zampetakis, Manolis
Other authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2021
Online access: https://hdl.handle.net/1721.1/137449
_version_ 1826207295022825472
author Daskalakis, Constantinos
Gouleakis, Themis
Tzamos, Chistos
Zampetakis, Manolis
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Daskalakis, Constantinos
Gouleakis, Themis
Tzamos, Chistos
Zampetakis, Manolis
author_sort Daskalakis, Constantinos
collection MIT
description We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples. Truncated samples from a d-variate normal N(μ, Σ) means a sample is only revealed if it falls in some subset S ⊆ ℝ^d; otherwise the samples are hidden, and their count in proportion to the revealed samples is also hidden. We show that the mean μ and covariance matrix Σ can be estimated with arbitrary accuracy in polynomial time, as long as we have oracle access to S, and S has non-trivial measure under the unknown d-variate normal distribution. Additionally, we show that without oracle access to S, any non-trivial estimation is impossible.
first_indexed 2024-09-23T13:47:05Z
format Article
id mit-1721.1/137449
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T13:47:05Z
publishDate 2021
publisher Institute of Electrical and Electronics Engineers (IEEE)
record_format dspace
spelling mit-1721.1/137449 2022-09-28T16:10:21Z Efficient Statistics, in High Dimensions, from Truncated Samples Daskalakis, Constantinos Gouleakis, Themis Tzamos, Chistos Zampetakis, Manolis Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory We provide an efficient algorithm for the classical problem, going back to Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the parameters of a multivariate normal distribution from truncated samples. Truncated samples from a d-variate normal N(μ, Σ) means a sample is only revealed if it falls in some subset S ⊆ ℝ^d; otherwise the samples are hidden, and their count in proportion to the revealed samples is also hidden. We show that the mean μ and covariance matrix Σ can be estimated with arbitrary accuracy in polynomial time, as long as we have oracle access to S, and S has non-trivial measure under the unknown d-variate normal distribution. Additionally, we show that without oracle access to S, any non-trivial estimation is impossible. 2021-11-05T13:34:56Z 2021-11-05T13:34:56Z 2018-10 2019-05-17T15:14:02Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/137449 Daskalakis, Constantinos, Gouleakis, Themis, Tzamos, Chistos and Zampetakis, Manolis. 2018. "Efficient Statistics, in High Dimensions, from Truncated Samples." en 10.1109/focs.2018.00067 Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) arXiv
spellingShingle Daskalakis, Constantinos
Gouleakis, Themis
Tzamos, Chistos
Zampetakis, Manolis
Efficient Statistics, in High Dimensions, from Truncated Samples
title Efficient Statistics, in High Dimensions, from Truncated Samples
title_full Efficient Statistics, in High Dimensions, from Truncated Samples
title_fullStr Efficient Statistics, in High Dimensions, from Truncated Samples
title_full_unstemmed Efficient Statistics, in High Dimensions, from Truncated Samples
title_short Efficient Statistics, in High Dimensions, from Truncated Samples
title_sort efficient statistics in high dimensions from truncated samples
url https://hdl.handle.net/1721.1/137449
work_keys_str_mv AT daskalakisconstantinos efficientstatisticsinhighdimensionsfromtruncatedsamples
AT gouleakisthemis efficientstatisticsinhighdimensionsfromtruncatedsamples
AT tzamoschistos efficientstatisticsinhighdimensionsfromtruncatedsamples
AT zampetakismanolis efficientstatisticsinhighdimensionsfromtruncatedsamples