Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods

Projections are conventional dimensionality reduction methods for information visualization that transform high-dimensional data into a low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to...

Bibliographic Details
Main Authors: Michael C. Thrun, PhD, Alfred Ultsch, Prof. Dr. habil.
Format: Article
Language: English
Published: Elsevier 2020-01-01
Series: MethodsX
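Subjects: Dimensionality reduction; Projection methods; Data visualization; Unsupervised neural networks; Self-organizing maps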
Online Access: http://www.sciencedirect.com/science/article/pii/S2215016120303137
collection DOAJ
description Projections are conventional dimensionality reduction methods for information visualization that transform high-dimensional data into a low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance- and density-based structures. However, the Johnson–Lindenstrauss lemma implies that the two-dimensional similarities in the scatter plot cannot necessarily represent high-dimensional structures. Here, a simplified emergent self-organizing map uses the projected points of such a scatter plot in combination with the dataset to compute the generalized U-matrix. The generalized U-matrix defines the visualization of a topographic map that depicts the misrepresentations of projected points with regard to a given dimensionality reduction method and the dataset.
• The topographic map provides accurate information about the distance- and density-based structures of high-dimensional data if an appropriate dimensionality reduction method is selected.
• The topographic map can uncover the absence of distance-based structures.
• The topographic map reveals the number of clusters in a dataset as the number of valleys.
first_indexed 2024-12-16T10:05:09Z
format Article
id doaj.art-03c37bea42264da78d71c92facd28767
institution Directory Open Access Journal
issn 2215-0161
language English
last_indexed 2024-12-16T10:05:09Z
publishDate 2020-01-01
publisher Elsevier
record_format Article
series MethodsX
spelling doaj.art-03c37bea42264da78d71c92facd28767
2022-12-21T22:35:41Z
eng
Elsevier
MethodsX
2215-0161
2020-01-01
7
101093
Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods
Michael C. Thrun, PhD [0]
Alfred Ultsch, Prof. Dr. habil. [1]
[0] Dept. of Hematology, Oncology and Immunology, Philipps-University of Marburg, Baldingerstraße, D-35043 Marburg; Corresponding author.
[1] Databionics Research Group, Philipps-University of Marburg, Hans-Meerwein-Straße 6, Marburg D-35032, Germany
http://www.sciencedirect.com/science/article/pii/S2215016120303137
Dimensionality reduction
Projection methods
Data visualization
Unsupervised neural networks
Self-organizing maps
title Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods
topic Dimensionality reduction
Projection methods
Data visualization
Unsupervised neural networks
Self-organizing maps
url http://www.sciencedirect.com/science/article/pii/S2215016120303137