Distribution-Dependent Weighted Union Bound
In this paper, we deal with the classical Statistical Learning Theory problem of bounding, with high probability, the true risk $R(h)$ of a hypothesis $h$ chosen from a set $\mathcal{H}$ of $m$ hypotheses.
Main Authors: | Luca Oneto, Sandro Ridella |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-01-01 |
Series: | Entropy |
Subjects: | union bound; weighted union bound; distribution-dependent weights; statistical learning theory; finite number of hypothesis |
Online Access: | https://www.mdpi.com/1099-4300/23/1/101 |
_version_ | 1797412554312515584 |
---|---|
author | Luca Oneto Sandro Ridella |
author_facet | Luca Oneto Sandro Ridella |
author_sort | Luca Oneto |
collection | DOAJ |
description | In this paper, we deal with the classical Statistical Learning Theory problem of bounding, with high probability, the true risk $R(h)$ of a hypothesis $h$ chosen from a set $\mathcal{H}$ of $m$ hypotheses. The Union Bound (UB) allows one to state that $\mathbb{P}\left\{\mathtt{L}\left(\hat{R}(h), \delta q_h\right) \le R(h) \le \mathtt{U}\left(\hat{R}(h), \delta p_h\right)\right\} \ge 1-\delta$, where $\hat{R}(h)$ is the empirical error, if it is possible to prove that $\mathbb{P}\{R(h) \ge \mathtt{L}(\hat{R}(h), \delta)\} \ge 1-\delta$ and $\mathbb{P}\{R(h) \le \mathtt{U}(\hat{R}(h), \delta)\} \ge 1-\delta$, when $h$, $q_h$, and $p_h$ are chosen before seeing the data such that $q_h, p_h \in [0,1]$ and $\sum_{h \in \mathcal{H}} (q_h + p_h) = 1$. If no a priori information is available, $q_h$ and $p_h$ are set to $1/(2m)$, i.e., uniformly distributed. This approach gives poor results because a learning procedure targets only particular hypotheses, namely those with small empirical error, and disregards the others. In this work we set $q_h$ and $p_h$ in a distribution-dependent way, increasing the probability of being chosen for functions with small true risk. We call this proposal the Distribution-Dependent Weighted UB (DDWUB), and we derive sufficient conditions on the choice of $q_h$ and $p_h$ under which DDWUB outperforms or, in the worst case, degenerates into the UB. Furthermore, theoretical and numerical results show the applicability, the validity, and the potential of DDWUB. (A sketch of the underlying union-bound argument is given after the record.) |
first_indexed | 2024-03-09T05:04:49Z |
format | Article |
id | doaj.art-2f1ac63c34f44d52b7ae824395f62142 |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-09T05:04:49Z |
publishDate | 2021-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-2f1ac63c34f44d52b7ae824395f62142 2023-12-03T12:55:45Z eng MDPI AG Entropy 1099-4300 2021-01-01 vol. 23, iss. 1, art. 101, doi:10.3390/e23010101 Distribution-Dependent Weighted Union Bound Luca Oneto (Department of Computer Science, Bioengineering, Robotics and Systems Engineering, University of Genoa, Via Opera Pia 11a, 16145 Genova, Italy); Sandro Ridella (Department of Biophysical and Electronic Engineering, University of Genoa, Via Opera Pia 11a, 16145 Genova, Italy); (abstract as in the description field); https://www.mdpi.com/1099-4300/23/1/101 union bound; weighted union bound; distribution-dependent weights; statistical learning theory; finite number of hypothesis |
spellingShingle | Luca Oneto; Sandro Ridella; Distribution-Dependent Weighted Union Bound; Entropy; union bound; weighted union bound; distribution-dependent weights; statistical learning theory; finite number of hypothesis |
title | Distribution-Dependent Weighted Union Bound |
title_full | Distribution-Dependent Weighted Union Bound |
title_fullStr | Distribution-Dependent Weighted Union Bound |
title_full_unstemmed | Distribution-Dependent Weighted Union Bound |
title_short | Distribution-Dependent Weighted Union Bound |
title_sort | distribution dependent weighted union bound |
topic | union bound; weighted union bound; distribution-dependent weights; statistical learning theory; finite number of hypothesis |
url | https://www.mdpi.com/1099-4300/23/1/101 |
work_keys_str_mv | AT lucaoneto distributiondependentweightedunionbound AT sandroridella distributiondependentweightedunionbound |
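The abstract's guarantee is an instance of the weighted union bound; the derivation below is a minimal sketch reconstructed from the quantities named in the abstract ($R$, $\hat{R}$, $\mathtt{L}$, $\mathtt{U}$, $q_h$, $p_h$, $\delta$), under the stated assumption that each per-hypothesis bound holds at level $\delta q_h$ or $\delta p_h$; it is not taken from the paper itself.

```latex
% Minimal sketch of the weighted union-bound argument (reconstruction, not the paper's proof).
% Assumption: for every fixed h, with q_h and p_h chosen before seeing the data,
%   P{ R(h) < L(\hat R(h), \delta q_h) } <= \delta q_h   and
%   P{ R(h) > U(\hat R(h), \delta p_h) } <= \delta p_h.
\begin{align*}
\mathbb{P}\Big\{\exists\, h \in \mathcal{H}:\;
      R(h) < \mathtt{L}\big(\hat{R}(h), \delta q_h\big)
      \ \text{or}\
      R(h) > \mathtt{U}\big(\hat{R}(h), \delta p_h\big)\Big\}
  &\le \sum_{h \in \mathcal{H}}
        \Big[\,\mathbb{P}\big\{R(h) < \mathtt{L}\big(\hat{R}(h), \delta q_h\big)\big\}
            + \mathbb{P}\big\{R(h) > \mathtt{U}\big(\hat{R}(h), \delta p_h\big)\big\}\Big] \\
  &\le \sum_{h \in \mathcal{H}} \big(\delta q_h + \delta p_h\big)
   \;=\; \delta \sum_{h \in \mathcal{H}} (q_h + p_h) \;=\; \delta .
\end{align*}
```

Hence the two-sided bound $\mathtt{L}(\hat{R}(h), \delta q_h) \le R(h) \le \mathtt{U}(\hat{R}(h), \delta p_h)$ holds simultaneously for all $h \in \mathcal{H}$ with probability at least $1-\delta$. The uniform choice $q_h = p_h = 1/(2m)$ recovers the classical UB; DDWUB instead reallocates the weight budget, still subject to $\sum_{h \in \mathcal{H}}(q_h + p_h) = 1$, toward hypotheses with small true risk.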