Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
We consider a binary classification problem for a test sequence to determine from which source the sequence is generated. The system classifies the test sequence based on empirically observed (training) sequences obtained from unknown sources <inline-formula><math xmlns="http://www.w3....
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2022-04-01
|
Series: | Entropy |
Subjects: | |
Online Access: | https://www.mdpi.com/1099-4300/24/5/635 |
_version_ | 1797500020548698112 |
---|---|
author | Hiroto Kuramata Hideki Yagi |
author_facet | Hiroto Kuramata Hideki Yagi |
author_sort | Hiroto Kuramata |
collection | DOAJ |
description | We consider a binary classification problem for a test sequence to determine from which source the sequence is generated. The system classifies the test sequence based on empirically observed (training) sequences obtained from unknown sources <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula>. We analyze the asymptotic fundamental limits of statistical classification for sources with multiple subclasses. We investigate the first- and second-order maximum error exponents under the constraint that the type-I error probability for all pairs of distributions decays exponentially fast and the type-II error probability is upper bounded by a small constant. In this paper, we first give a classifier which achieves the asymptotically maximum error exponent in the class of deterministic classifiers for sources with multiple subclasses, and then provide a characterization of the first-order error exponent. We next provide a characterization of the second-order error exponent in the case where only <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> has multiple subclasses but <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> does not. We generalize our results to classification in the case that <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> are a stationary and memoryless source and a mixed memoryless source with general mixture, respectively. |
first_indexed | 2024-03-10T03:55:50Z |
format | Article |
id | doaj.art-27ad196ae7054b22a83b5bd56b13fef7 |
institution | Directory Open Access Journal |
issn | 1099-4300 |
language | English |
last_indexed | 2024-03-10T03:55:50Z |
publishDate | 2022-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-27ad196ae7054b22a83b5bd56b13fef72023-11-23T10:54:56ZengMDPI AGEntropy1099-43002022-04-0124563510.3390/e24050635Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple SubclassesHiroto Kuramata0Hideki Yagi1Department of Computer and Network Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu 182-8585, Tokyo, JapanDepartment of Computer and Network Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu 182-8585, Tokyo, JapanWe consider a binary classification problem for a test sequence to determine from which source the sequence is generated. The system classifies the test sequence based on empirically observed (training) sequences obtained from unknown sources <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula>. We analyze the asymptotic fundamental limits of statistical classification for sources with multiple subclasses. We investigate the first- and second-order maximum error exponents under the constraint that the type-I error probability for all pairs of distributions decays exponentially fast and the type-II error probability is upper bounded by a small constant. In this paper, we first give a classifier which achieves the asymptotically maximum error exponent in the class of deterministic classifiers for sources with multiple subclasses, and then provide a characterization of the first-order error exponent. We next provide a characterization of the second-order error exponent in the case where only <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> has multiple subclasses but <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> does not. We generalize our results to classification in the case that <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> are a stationary and memoryless source and a mixed memoryless source with general mixture, respectively.https://www.mdpi.com/1099-4300/24/5/635binary classificationerror exponentmultiple subclasses |
spellingShingle | Hiroto Kuramata Hideki Yagi Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses Entropy binary classification error exponent multiple subclasses |
title | Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses |
title_full | Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses |
title_fullStr | Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses |
title_full_unstemmed | Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses |
title_short | Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses |
title_sort | analysis on optimal error exponents of binary classification for source with multiple subclasses |
topic | binary classification error exponent multiple subclasses |
url | https://www.mdpi.com/1099-4300/24/5/635 |
work_keys_str_mv | AT hirotokuramata analysisonoptimalerrorexponentsofbinaryclassificationforsourcewithmultiplesubclasses AT hidekiyagi analysisonoptimalerrorexponentsofbinaryclassificationforsourcewithmultiplesubclasses |