Fuzzy Information Discrimination Measures and Their Application to Low Dimensional Embedding Construction in the UMAP Algorithm
Dimensionality reduction techniques are often used by researchers in order to make high dimensional data easier to interpret visually, as data visualization is only possible in low dimensional spaces. Recent research in nonlinear dimensionality reduction introduced many effective algorithms, includi...
Main Authors: | Liliya A. Demidova, Artyom V. Gorchakov |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-04-01 |
Series: | Journal of Imaging |
Subjects: | dimension reduction, data visualization, entropy, cross-entropy, fuzzy logic |
Online Access: | https://www.mdpi.com/2313-433X/8/4/113 |
_version_ | 1797445614724710400 |
---|---|
author | Liliya A. Demidova Artyom V. Gorchakov |
collection | DOAJ |
description | Dimensionality reduction techniques are often used by researchers in order to make high dimensional data easier to interpret visually, as data visualization is only possible in low dimensional spaces. Recent research in nonlinear dimensionality reduction introduced many effective algorithms, including t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), dimensionality reduction technique based on triplet constraints (TriMAP), and pairwise controlled manifold approximation (PaCMAP), aimed to preserve both the local and global structure of high dimensional data while reducing the dimensionality. The UMAP algorithm has found its application in bioinformatics, genetics, genomics, and has been widely used to improve the accuracy of other machine learning algorithms. In this research, we compare the performance of different fuzzy information discrimination measures used as loss functions in the UMAP algorithm while constructing low dimensional embeddings. In order to achieve this, we derive the gradients of the considered losses analytically and employ the Adam algorithm during the loss function optimization process. From the conducted experimental studies we conclude that the use of either the logarithmic fuzzy cross entropy loss without reduced repulsion or the symmetric logarithmic fuzzy cross entropy loss with sufficiently large neighbor count leads to better global structure preservation of the original multidimensional data when compared to the loss function used in the original UMAP algorithm implementation. |
first_indexed | 2024-03-09T13:29:25Z |
format | Article |
id | doaj.art-9ab65ff7a87c47e7a7279fd6238ae148 |
institution | Directory Open Access Journal |
issn | 2313-433X |
language | English |
last_indexed | 2024-03-09T13:29:25Z |
publishDate | 2022-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Journal of Imaging |
spelling | doaj.art-9ab65ff7a87c47e7a7279fd6238ae148; 2023-11-30T21:20:39Z; eng; MDPI AG; Journal of Imaging; ISSN 2313-433X; 2022-04-01; vol. 8, no. 4, art. 113; DOI 10.3390/jimaging8040113; Fuzzy Information Discrimination Measures and Their Application to Low Dimensional Embedding Construction in the UMAP Algorithm; Liliya A. Demidova, Artyom V. Gorchakov (both: Institute of Information Technologies, Federal State Budget Educational Institution of Higher Education “MIREA–Russian Technological University”, 78, Vernadsky Avenue, 119454 Moscow, Russia); https://www.mdpi.com/2313-433X/8/4/113; dimension reduction; data visualization; entropy; cross-entropy; fuzzy logic |
title | Fuzzy Information Discrimination Measures and Their Application to Low Dimensional Embedding Construction in the UMAP Algorithm |
topic | dimension reduction data visualization entropy cross-entropy fuzzy logic |
url | https://www.mdpi.com/2313-433X/8/4/113 |
work_keys_str_mv | AT liliyaademidova fuzzyinformationdiscriminationmeasuresandtheirapplicationtolowdimensionalembeddingconstructionintheumapalgorithm AT artyomvgorchakov fuzzyinformationdiscriminationmeasuresandtheirapplicationtolowdimensionalembeddingconstructionintheumapalgorithm |
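
The description above compares fuzzy information discrimination measures used as UMAP loss functions. As a rough illustration only, the sketch below evaluates the standard fuzzy cross entropy between high dimensional memberships `v` and low dimensional memberships `w`, which is the general form associated with the original UMAP formulation. It is not the article's implementation and does not reproduce the logarithmic or symmetric variants the authors study. The function names, the `EPS` guard, and the defaults `a = 1`, `b = 1` are assumptions made for this example.

```python
import math

EPS = 1e-12  # assumed guard against log(0); not taken from the article


def low_dim_membership(y_i, y_j, a=1.0, b=1.0):
    """UMAP-style low dimensional membership 1 / (1 + a * d^(2b)).

    a and b are curve parameters; the defaults here are illustrative.
    """
    d2 = sum((p - q) ** 2 for p, q in zip(y_i, y_j))  # squared distance
    return 1.0 / (1.0 + a * d2 ** b)


def fuzzy_cross_entropy(v, w):
    """Sum over edges of v*log(v/w) + (1-v)*log((1-v)/(1-w)).

    v and w are equal-length sequences of memberships in [0, 1].
    """
    total = 0.0
    for v_ij, w_ij in zip(v, w):
        # Clip both memberships away from 0 and 1 so the logs stay finite.
        v_c = min(max(v_ij, EPS), 1.0 - EPS)
        w_c = min(max(w_ij, EPS), 1.0 - EPS)
        total += v_c * math.log(v_c / w_c)  # attractive term
        total += (1.0 - v_c) * math.log((1.0 - v_c) / (1.0 - w_c))  # repulsive term
    return total
```

The first term pulls points together where the high dimensional membership is large, and the second pushes them apart where it is small. The loss is zero when the two membership sets coincide and grows as they diverge.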