Polynomial time algorithms for dual volume sampling

© 2017 Neural information processing systems foundation. All rights reserved. We study dual volume sampling, a method for selecting k columns from an n × m short and wide matrix (n ≤ k ≤ m) such that the probability of selection is proportional to the volume spanned by the rows of the induced submat...

Full description

Bibliographic Details
Main Authors: Li, C, Jegelka, S, Sra, S
Format: Article
Language:English
Published: 2021
Online Access:https://hdl.handle.net/1721.1/132306
_version_ 1811072763110096896
author Li, C
Jegelka, S
Sra, S
author_facet Li, C
Jegelka, S
Sra, S
author_sort Li, C
collection MIT
description © 2017 Neural information processing systems foundation. All rights reserved. We study dual volume sampling, a method for selecting k columns from an n × m short and wide matrix (n ≤ k ≤ m) such that the probability of selection is proportional to the volume spanned by the rows of the induced submatrix. This method was proposed by Avron and Boutsidis (2013), who showed it to be a promising method for column subset selection and its multiple applications. However, its wider adoption has been hampered by the lack of polynomial time sampling algorithms. We remove this hindrance by developing an exact (randomized) polynomial time sampling algorithm as well as its derandomization. Thereafter, we study dual volume sampling via the theory of real stable polynomials and prove that its distribution satisfies the "Strong Rayleigh" property. This result has numerous consequences, including a provably fast-mixing Markov chain sampler that makes dual volume sampling much more attractive to practitioners. This sampler is closely related to classical algorithms for popular experimental design methods that are to date lacking theoretical analysis but are known to empirically work well.
first_indexed 2024-09-23T09:12:04Z
format Article
id mit-1721.1/132306
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T09:12:04Z
publishDate 2021
record_format dspace
spelling mit-1721.1/1323062021-09-21T03:59:21Z Polynomial time algorithms for dual volume sampling Li, C Jegelka, S Sra, S © 2017 Neural information processing systems foundation. All rights reserved. We study dual volume sampling, a method for selecting k columns from an n × m short and wide matrix (n ≤ k ≤ m) such that the probability of selection is proportional to the volume spanned by the rows of the induced submatrix. This method was proposed by Avron and Boutsidis (2013), who showed it to be a promising method for column subset selection and its multiple applications. However, its wider adoption has been hampered by the lack of polynomial time sampling algorithms. We remove this hindrance by developing an exact (randomized) polynomial time sampling algorithm as well as its derandomization. Thereafter, we study dual volume sampling via the theory of real stable polynomials and prove that its distribution satisfies the "Strong Rayleigh" property. This result has numerous consequences, including a provably fast-mixing Markov chain sampler that makes dual volume sampling much more attractive to practitioners. This sampler is closely related to classical algorithms for popular experimental design methods that are to date lacking theoretical analysis but are known to empirically work well. 2021-09-20T18:21:46Z 2021-09-20T18:21:46Z 2020-12-21T19:24:48Z Article http://purl.org/eprint/type/ConferencePaper https://hdl.handle.net/1721.1/132306 en Advances in Neural Information Processing Systems Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Neural Information Processing Systems (NIPS)
spellingShingle Li, C
Jegelka, S
Sra, S
Polynomial time algorithms for dual volume sampling
title Polynomial time algorithms for dual volume sampling
title_full Polynomial time algorithms for dual volume sampling
title_fullStr Polynomial time algorithms for dual volume sampling
title_full_unstemmed Polynomial time algorithms for dual volume sampling
title_short Polynomial time algorithms for dual volume sampling
title_sort polynomial time algorithms for dual volume sampling
url https://hdl.handle.net/1721.1/132306
work_keys_str_mv AT lic polynomialtimealgorithmsfordualvolumesampling
AT jegelkas polynomialtimealgorithmsfordualvolumesampling
AT sras polynomialtimealgorithmsfordualvolumesampling