The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes

The article formulates the Dictionary Recognition problem, which is relevant for a wide range of applied problems: word recognition in a noisy audio signal for natural language processing tasks or in a noisy electromagnetic signal, recognition of visual patterns in limited visibility, and much more....

Full description

Bibliographic Details
Main Author:	Alexandra Bernadotte
Format:	Article
Language:	English
Published:	MDPI AG 2022-08-01
Series:	Mathematics
Subjects:	graph algorithm classification accuracy brain–computer interface robotics clique <i>k</i>-partite clique
Online Access:	https://www.mdpi.com/2227-7390/10/15/2810

_version_	1797441386821189632
author	Alexandra Bernadotte
author_facet	Alexandra Bernadotte
author_sort	Alexandra Bernadotte
collection	DOAJ
description	The article formulates the Dictionary Recognition problem, which is relevant for a wide range of applied problems: word recognition in a noisy audio signal for natural language processing tasks or in a noisy electromagnetic signal, recognition of visual patterns in limited visibility, and much more. A Dictionary Recognition problem is finding a set of words from a given set to maximize the classification accuracy of the words in the dictionary without losing semantic representation. The idea of solving the problem is to represent a set of objects (encoded as a sequence of symbols or visual sequences) in the form of a <i>k</i>-partite graph, where each partite of the graph corresponds to a group of objects with a certain common feature (equivalence class). The task is to find such a set of representatives of the <i>k</i> equivalence classes on which the <i>k</i>-classification accuracy by the classifier H meets certain criteria: (1) maximum classification accuracy; (2) maximin accuracy—the binary classification accuracy of every two objects is not lower than a certain value. The proposed Maximin Algorithm provides <i>k</i>-partite cliques with a maximin worst-case classification accuracy and belongs to the <i>P</i>-class. The Maximal Algorithm provides <i>k</i>-partite cliques with the maximum total weight (the problem belongs to the <i>NP</i>-hard class). The presented algorithms select a set of representatives optimally in terms of classification accuracy for the certain classifier and runtime. The algorithms increase classification accuracy when using classical classification methods without additional optimization of the classifiers themselves. We tested the algorithms on simulated data and provide an open-source project on GitHub. The results of the Maximin and Maximal Algorithms give 4-, 8- and 16-classification accuracy close to the best accuracy (obtained by brute-force enumeration) and better than the median accuracy by more than 20% for the support vector machine classifiers. Furthermore, the algorithms increase the selection speed of representatives by five orders of magnitude compared to the brute-force algorithm with a slight loss of accuracy.
first_indexed	2024-03-09T12:23:17Z
format	Article
id	doaj.art-ecd53adc602d4212ba0ec07a3717f806
institution	Directory Open Access Journal
issn	2227-7390
language	English
last_indexed	2024-03-09T12:23:17Z
publishDate	2022-08-01
publisher	MDPI AG
record_format	Article
series	Mathematics
spelling	doaj.art-ecd53adc602d4212ba0ec07a3717f8062023-11-30T22:39:07ZengMDPI AGMathematics2227-73902022-08-011015281010.3390/math10152810The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence ClassesAlexandra Bernadotte0AiChem, 170 71 Solna, SwedenThe article formulates the Dictionary Recognition problem, which is relevant for a wide range of applied problems: word recognition in a noisy audio signal for natural language processing tasks or in a noisy electromagnetic signal, recognition of visual patterns in limited visibility, and much more. A Dictionary Recognition problem is finding a set of words from a given set to maximize the classification accuracy of the words in the dictionary without losing semantic representation. The idea of solving the problem is to represent a set of objects (encoded as a sequence of symbols or visual sequences) in the form of a <i>k</i>-partite graph, where each partite of the graph corresponds to a group of objects with a certain common feature (equivalence class). The task is to find such a set of representatives of the <i>k</i> equivalence classes on which the <i>k</i>-classification accuracy by the classifier H meets certain criteria: (1) maximum classification accuracy; (2) maximin accuracy—the binary classification accuracy of every two objects is not lower than a certain value. The proposed Maximin Algorithm provides <i>k</i>-partite cliques with a maximin worst-case classification accuracy and belongs to the <i>P</i>-class. The Maximal Algorithm provides <i>k</i>-partite cliques with the maximum total weight (the problem belongs to the <i>NP</i>-hard class). The presented algorithms select a set of representatives optimally in terms of classification accuracy for the certain classifier and runtime. The algorithms increase classification accuracy when using classical classification methods without additional optimization of the classifiers themselves. We tested the algorithms on simulated data and provide an open-source project on GitHub. The results of the Maximin and Maximal Algorithms give 4-, 8- and 16-classification accuracy close to the best accuracy (obtained by brute-force enumeration) and better than the median accuracy by more than 20% for the support vector machine classifiers. Furthermore, the algorithms increase the selection speed of representatives by five orders of magnitude compared to the brute-force algorithm with a slight loss of accuracy.https://www.mdpi.com/2227-7390/10/15/2810graph algorithmclassification accuracybrain–computer interfaceroboticsclique<i>k</i>-partite clique
spellingShingle	Alexandra Bernadotte The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes Mathematics graph algorithm classification accuracy brain–computer interface robotics clique <i>k</i>-partite clique
title	The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes
title_full	The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes
title_fullStr	The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes
title_full_unstemmed	The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes
title_short	The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes
title_sort	algorithm that maximizes the accuracy of i k i classification on the set of representatives of the i k i equivalence classes
topic	graph algorithm classification accuracy brain–computer interface robotics clique <i>k</i>-partite clique
url	https://www.mdpi.com/2227-7390/10/15/2810
work_keys_str_mv	AT alexandrabernadotte thealgorithmthatmaximizestheaccuracyofikiclassificationonthesetofrepresentativesoftheikiequivalenceclasses AT alexandrabernadotte algorithmthatmaximizestheaccuracyofikiclassificationonthesetofrepresentativesoftheikiequivalenceclasses

The Algorithm That Maximizes the Accuracy of <i>k</i>-Classification on the Set of Representatives of the <i>k</i> Equivalence Classes

Similar Items