Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods
Projections are conventional methods of dimensionality reduction for information visualization used to transform high-dimensional data into low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to...
Main Authors: | Michael C. Thrun, Alfred Ultsch |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2020-01-01 |
Series: | MethodsX |
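Subjects: | Dimensionality reduction; Projection methods; Data visualization; Unsupervised neural networks; Self-organizing maps |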
Online Access: | http://www.sciencedirect.com/science/article/pii/S2215016120303137 |
_version_ | 1818590980352245760 |
---|---|
author | Michael C. Thrun, PhD; Alfred Ultsch, Prof. Dr. habil. |
author_facet | Michael C. Thrun, PhD; Alfred Ultsch, Prof. Dr. habil. |
author_sort | Michael C. Thrun, PhD |
collection | DOAJ |
description | Projections are conventional methods of dimensionality reduction for information visualization used to transform high-dimensional data into low-dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance- and density-based structures. However, the Johnson–Lindenstrauss lemma states that the two-dimensional similarities in the scatter plot cannot coercively represent high-dimensional structures. Here, a simplified emergent self-organizing map uses the projected points of such a scatter plot in combination with the dataset in order to compute the generalized U-matrix. The generalized U-matrix defines the visualization of a topographic map depicting the misrepresentations of projected points with regard to a given dimensionality reduction method and the dataset. • The topographic map provides accurate information about the high-dimensional distance- and density-based structures of high-dimensional data if an appropriate dimensionality reduction method is selected. • The topographic map can uncover the absence of distance-based structures. • The topographic map reveals the number of clusters in a dataset as the number of valleys. |
first_indexed | 2024-12-16T10:05:09Z |
format | Article |
id | doaj.art-03c37bea42264da78d71c92facd28767 |
institution | Directory Open Access Journal |
issn | 2215-0161 |
language | English |
last_indexed | 2024-12-16T10:05:09Z |
publishDate | 2020-01-01 |
publisher | Elsevier |
record_format | Article |
series | MethodsX |
spelling | doaj.art-03c37bea42264da78d71c92facd28767; 2022-12-21T22:35:41Z; eng; Elsevier; MethodsX; 2215-0161; 2020-01-01; 7; 101093; Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods; Michael C. Thrun, PhD [0]; Alfred Ultsch, Prof. Dr. habil. [1]; [0] Dept. of Hematology, Oncology and Immunology, Philipps-University of Marburg, Baldingerstraße, D-35043 Marburg (corresponding author); [1] Databionics Research Group, Philipps-University of Marburg, Hans-Meerwein-Straße 6, Marburg D-35032, Germany; Projections are conventional methods of dimensionality reduction for information visualization used to transform high-dimensional data into low dimensional space. If the projection method restricts the output space to two dimensions, the result is a scatter plot. The goal of this scatter plot is to visualize the relative relationships between high-dimensional data points that build up distance and density-based structures. However, the Johnson–Lindenstrauss lemma states that the two-dimensional similarities in the scatter plot cannot coercively represent high-dimensional structures. Here, a simplified emergent self-organizing map uses the projected points of such a scatter plot in combination with the dataset in order to compute the generalized U-matrix. The generalized U-matrix defines the visualization of a topographic map depicting the misrepresentations of projected points with regards to a given dimensionality reduction method and the dataset. • The topographic map provides accurate information about the high-dimensional distance and density based structures of high-dimensional data if an appropriate dimensionality reduction method is selected. • The topographic map can uncover the absence of distance-based structures. • The topographic map reveals the number of clusters in a dataset as the number of valleys.; http://www.sciencedirect.com/science/article/pii/S2215016120303137; Dimensionality reduction; Projection methods; Data visualization; Unsupervised neural networks; Self-organizing maps |
spellingShingle | Michael C. Thrun, PhD; Alfred Ultsch, Prof. Dr. habil.; Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods; MethodsX; Dimensionality reduction; Projection methods; Data visualization; Unsupervised neural networks; Self-organizing maps |
title | Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_full | Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_fullStr | Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_full_unstemmed | Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_short | Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods |
title_sort | uncovering high dimensional structures of projections from dimensionality reduction methods |
topic | Dimensionality reduction; Projection methods; Data visualization; Unsupervised neural networks; Self-organizing maps |
url | http://www.sciencedirect.com/science/article/pii/S2215016120303137 |
work_keys_str_mv | AT michaelcthrunphd uncoveringhighdimensionalstructuresofprojectionsfromdimensionalityreductionmethods AT alfredultschprofdrhabil uncoveringhighdimensionalstructuresofprojectionsfromdimensionalityreductionmethods |