Summary: | We study two constrained problems in high dimensions. We study a high dimensional inequality for the binary entropy. The perceptron is a natural model in high-dimensional probability, and a toy shallow neural network which stores random patterns; we also study a randomized variant of the symmetric binary perceptron.
We first consider the (k + 1)-th derivative of xᵏ⁻ʳH(xʳ), where H(x) := −x log x − (1 − x) log (1 − x),0 ≤ x ≤ 1 is the binary entropy and k ≥ r ≥ 1 are integers. Our motivation is the conjectural entropy inequality αₖH(xᵏ) ≥ xᵏ⁻¹H(x), where 0 < αₖ < 1 is given by a functional equation. The k = 2 case was the key technical tool driving recent breakthroughs on the union-closed sets conjecture, and the k → ∞ case can be considered the "high dimensional limit". We express ((dᵏ⁺¹) / (dxᵏ⁺¹)) xᵏ⁻ʳH(xʳ) as a rational function, an infinite series, and a sum over generalized Stirling numbers. This allows us to reduce the proof of the entropy inequality for real k to showing that an associated polynomial has only two real roots in the interval (0,1). This reduction allows us to easily verify the inequality for fixed k such as k = 2,3,4 with a finite calculation, and also allows us to prove the inequality for any fixed fractional exponent such as k = 3/2 via a finite calculation. The proof suggests a new framework for proving tight inequalities for the sum of polynomials times the logarithms of polynomials, which converts the inequality into a statement about the real roots of a simpler associated polynomial.
The symmetric binary perceptron (SBP) is a random constraint satisfaction problem (CSP) and a single-layer neural network; it exhibits intriguing features, most notably a sharp phase transition regarding the existence of its satisfying solutions. Secondly, we propose two novel generalizations of the SBP by incorporating random labels. Our proposals admit a natural machine learning interpretation: any satisfying solution to the random CSP is a minimizer of a certain empirical risk. We establish that the expected number of solutions for both models undergoes a sharp phase transition and calculate the location of this transition, which corresponds to the annealed capacity in statistical physics. We then establish, through the Berry-Esseen theorem, a universality result: the location of this transition does not depend on the underlying distribution. We conjecture that both models in fact exhibit an even stronger phase transition akin to the SBP and give rigorous evidence towards this conjecture through the second moment method. Our final focus is on the algorithmic problem of efficiently finding a satisfying solution to our models. We show that both models exhibit the multi Overlap Gap Property (m-OGP), an intricate geometrical property of the solution space which is known to be a rigorous barrier against large classes of algorithms. This gives rigorous evidence of a statistical-to-computational gap for both models. We also show that the m-OGP satisfies a similar universality property.
|