How to visualize high-dimensional data: a roadmap
Discovery of the chronological or geographical distribution of collections of historical text can be more reliable when based on multivariate rather than on univariate data because multivariate data provide a more complete description. Where the data are high-dimensional, howe...
Main Author: | Hermann Moisl |
---|---|
Format: | Article |
Language: | English |
Published: | Nicolas Turenne 2020-12-01 |
Series: | Journal of Data Mining and Digital Humanities |
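Subjects: | dimensionality reduction; high dimensionality; multivariate data; data visualization; cluster analysis; [shs] humanities and social sciences |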
Online Access: | https://jdmdh.episciences.org/7021/pdf |
_version_ | 1818582771746996224 |
---|---|
author | Hermann Moisl |
author_facet | Hermann Moisl |
author_sort | Hermann Moisl |
collection | DOAJ |
description | Discovery of the chronological or geographical distribution of collections of historical text can be more reliable when based on multivariate rather than univariate data, because multivariate data provide a more complete description. Where the data are high-dimensional, however, their complexity can defy analysis using traditional philological methods. The first step in dealing with such data is to visualize them using graphical methods in order to identify any latent structure. If found, such structure facilitates the formulation of hypotheses, which can be tested using a range of mathematical and statistical methods. Where the dimensionality is greater than 3, however, direct graphical investigation is impossible. This discussion presents a roadmap of how this obstacle can be overcome, in three main parts: the first presents some fundamental data concepts, the second describes an example corpus and a high-dimensional data set derived from it, and the third outlines two approaches to visualizing that data set: dimensionality reduction and cluster analysis. |
first_indexed | 2024-12-16T07:54:41Z |
format | Article |
id | doaj.art-7655036ba85642e9901cfbbfc0a6bf5e |
institution | Directory Open Access Journal |
issn | 2416-5999 |
language | English |
last_indexed | 2024-12-16T07:54:41Z |
publishDate | 2020-12-01 |
publisher | Nicolas Turenne |
record_format | Article |
series | Journal of Data Mining and Digital Humanities |
spelling | doaj.art-7655036ba85642e9901cfbbfc0a6bf5e; 2022-12-21T22:38:46Z; eng; Nicolas Turenne; Journal of Data Mining and Digital Humanities; 2416-5999; 2020-12-01; Special issue on Visualisations in Historical Linguistics; jdmdh:7021; How to visualize high-dimensional data: a roadmap; Hermann Moisl; https://jdmdh.episciences.org/7021/pdf; dimensionality reduction; high dimensionality; multivariate data; data visualization; cluster analysis; [shs] humanities and social sciences |
spellingShingle | Hermann Moisl; How to visualize high-dimensional data: a roadmap; Journal of Data Mining and Digital Humanities; dimensionality reduction; high dimensionality; multivariate data; data visualization; cluster analysis; [shs] humanities and social sciences |
title | How to visualize high-dimensional data: a roadmap |
title_sort | how to visualize high dimensional data a roadmap |
topic | dimensionality reduction; high dimensionality; multivariate data; data visualization; cluster analysis; [shs] humanities and social sciences |
url | https://jdmdh.episciences.org/7021/pdf |
work_keys_str_mv | AT hermannmoisl howtovisualizehighdimensionaldataaroadmap |
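
The description above names two approaches to visualizing a high-dimensional data set: dimensionality reduction and cluster analysis. The following is a minimal sketch of that idea, not the article's own procedure. It assumes principal component analysis (computed by SVD) as the reduction step and plain k-means as the clustering step; the record does not say which specific methods the article uses. The data matrix is synthetic, and the names `pca_project` and `kmeans` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)  # centre each variable on its mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kmeans(X, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        # squared distance from every point to its nearest chosen centre
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for a texts-by-variables matrix (assumption: 40 texts,
# 10 variables, two well-separated groups). Not data from the article.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(20, 10))
group_b = rng.normal(8.0, 1.0, size=(20, 10))
X = np.vstack([group_a, group_b])

coords = pca_project(X, n_components=2)  # 10 dimensions reduced to 2 for plotting
labels = kmeans(coords, k=2)             # group the projected points
```

The two columns of `coords` can be passed to any scatter-plot routine, with `labels` used to colour the points, which makes any latent grouping in the original 10-dimensional data visible in two dimensions.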