Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses

We consider a binary classification problem for a test sequence to determine from which source the sequence is generated. The system classifies the test sequence based on empirically observed (training) sequences obtained from unknown sources <inline-formula><math xmlns="http://www.w3....

Full description

Bibliographic Details
Main Authors: Hiroto Kuramata, Hideki Yagi
Format: Article
Language:English
Published: MDPI AG 2022-04-01
Series:Entropy
Subjects:
Online Access:https://www.mdpi.com/1099-4300/24/5/635
_version_ 1797500020548698112
author Hiroto Kuramata
Hideki Yagi
author_facet Hiroto Kuramata
Hideki Yagi
author_sort Hiroto Kuramata
collection DOAJ
description We consider a binary classification problem for a test sequence to determine from which source the sequence is generated. The system classifies the test sequence based on empirically observed (training) sequences obtained from unknown sources <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula>. We analyze the asymptotic fundamental limits of statistical classification for sources with multiple subclasses. We investigate the first- and second-order maximum error exponents under the constraint that the type-I error probability for all pairs of distributions decays exponentially fast and the type-II error probability is upper bounded by a small constant. In this paper, we first give a classifier which achieves the asymptotically maximum error exponent in the class of deterministic classifiers for sources with multiple subclasses, and then provide a characterization of the first-order error exponent. We next provide a characterization of the second-order error exponent in the case where only <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> has multiple subclasses but <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> does not. We generalize our results to classification in the case that <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> are a stationary and memoryless source and a mixed memoryless source with general mixture, respectively.
first_indexed 2024-03-10T03:55:50Z
format Article
id doaj.art-27ad196ae7054b22a83b5bd56b13fef7
institution Directory Open Access Journal
issn 1099-4300
language English
last_indexed 2024-03-10T03:55:50Z
publishDate 2022-04-01
publisher MDPI AG
record_format Article
series Entropy
spelling doaj.art-27ad196ae7054b22a83b5bd56b13fef72023-11-23T10:54:56ZengMDPI AGEntropy1099-43002022-04-0124563510.3390/e24050635Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple SubclassesHiroto Kuramata0Hideki Yagi1Department of Computer and Network Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu 182-8585, Tokyo, JapanDepartment of Computer and Network Engineering, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu 182-8585, Tokyo, JapanWe consider a binary classification problem for a test sequence to determine from which source the sequence is generated. The system classifies the test sequence based on empirically observed (training) sequences obtained from unknown sources <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula>. We analyze the asymptotic fundamental limits of statistical classification for sources with multiple subclasses. We investigate the first- and second-order maximum error exponents under the constraint that the type-I error probability for all pairs of distributions decays exponentially fast and the type-II error probability is upper bounded by a small constant. In this paper, we first give a classifier which achieves the asymptotically maximum error exponent in the class of deterministic classifiers for sources with multiple subclasses, and then provide a characterization of the first-order error exponent. We next provide a characterization of the second-order error exponent in the case where only <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> has multiple subclasses but <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> does not. We generalize our results to classification in the case that <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>1</mn></msub></semantics></math></inline-formula> and <inline-formula><math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"><semantics><msub><mi>P</mi><mn>2</mn></msub></semantics></math></inline-formula> are a stationary and memoryless source and a mixed memoryless source with general mixture, respectively.https://www.mdpi.com/1099-4300/24/5/635binary classificationerror exponentmultiple subclasses
spellingShingle Hiroto Kuramata
Hideki Yagi
Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
Entropy
binary classification
error exponent
multiple subclasses
title Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
title_full Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
title_fullStr Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
title_full_unstemmed Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
title_short Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses
title_sort analysis on optimal error exponents of binary classification for source with multiple subclasses
topic binary classification
error exponent
multiple subclasses
url https://www.mdpi.com/1099-4300/24/5/635
work_keys_str_mv AT hirotokuramata analysisonoptimalerrorexponentsofbinaryclassificationforsourcewithmultiplesubclasses
AT hidekiyagi analysisonoptimalerrorexponentsofbinaryclassificationforsourcewithmultiplesubclasses