New guidance for using t-SNE: Alternative defaults, hyperparameter selection automation, and comparative evaluation
We present new guidelines for choosing hyperparameters for t-SNE and an evaluation comparing these guidelines to current ones. These guidelines include a proposed empirically optimum guideline derived from a t-SNE hyperparameter grid search over a large collection of data sets. We also introduce a n...
Main Authors: | Robert Gove, Lucas Cadalzo, Nicholas Leiby, Jedediah M. Singer, Alexander Zaitzeff |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2022-06-01 |
Series: | Visual Informatics |
Subjects: | Dimensionality reduction; Machine learning; t-SNE |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2468502X22000201 |
_version_ | 1828438673413111808 |
---|---|
author | Robert Gove Lucas Cadalzo Nicholas Leiby Jedediah M. Singer Alexander Zaitzeff |
collection | DOAJ |
description | We present new guidelines for choosing hyperparameters for t-SNE and an evaluation comparing these guidelines to current ones. These guidelines include a proposed empirically optimum guideline derived from a t-SNE hyperparameter grid search over a large collection of data sets. We also introduce a new method to featurize data sets using graph-based metrics called scagnostics; we use these features to train a neural network that predicts optimal t-SNE hyperparameters for the respective data set. This neural network has the potential to simplify the use of t-SNE by removing guesswork about which hyperparameters will produce the best embedding. We evaluate and compare our neural network-derived and empirically optimum hyperparameters to several other t-SNE hyperparameter guidelines from the literature on 68 data sets. The hyperparameters predicted by our neural network yield embeddings with accuracy similar to that of the best current t-SNE guidelines. Using our empirically optimum hyperparameters is simpler than following previously published guidelines but yields more accurate embeddings, in some cases by a statistically significant margin. We find that the useful ranges for t-SNE hyperparameters are narrower and include smaller values than previously reported in the literature. Importantly, we also quantify the potential for future improvements in this area: using data from a grid search of t-SNE hyperparameters, we find that an optimal selection method could improve embedding accuracy by up to two percentage points over the methods examined in this paper. |
first_indexed | 2024-12-10T20:08:32Z |
format | Article |
id | doaj.art-594babfbfc27491ca8f51a4869de6823 |
institution | Directory Open Access Journal |
issn | 2468-502X |
language | English |
last_indexed | 2024-12-10T20:08:32Z |
publishDate | 2022-06-01 |
publisher | Elsevier |
record_format | Article |
series | Visual Informatics |
spelling | doaj.art-594babfbfc27491ca8f51a4869de6823; 2022-12-22T01:35:20Z; eng; Elsevier; Visual Informatics; 2468-502X; 2022-06-01; vol. 6, no. 2, pp. 87-97; New guidance for using t-SNE: Alternative defaults, hyperparameter selection automation, and comparative evaluation; Robert Gove (corresponding author; Two Six Technologies, USA); Lucas Cadalzo (Two Six Technologies, USA); Nicholas Leiby (Two Six Technologies, USA); Jedediah M. Singer (Two Six Technologies, USA); Alexander Zaitzeff (Two Six Technologies, USA); http://www.sciencedirect.com/science/article/pii/S2468502X22000201; Dimensionality reduction; Machine learning; t-SNE |
title | New guidance for using t-SNE: Alternative defaults, hyperparameter selection automation, and comparative evaluation |
topic | Dimensionality reduction Machine learning t-SNE |
url | http://www.sciencedirect.com/science/article/pii/S2468502X22000201 |
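
The description above mentions two quantities drawn from a hyperparameter grid search: a single "empirically optimum" setting chosen across many data sets, and the accuracy that a perfect per-data-set selection method could still gain over it. The following is a minimal sketch of how such quantities could be computed from grid-search results. It is not the authors' code. The function names, the choice of (perplexity, learning rate) as the grid axes, and all accuracy values are assumptions made up for illustration and are not taken from the paper.

```python
from statistics import mean

# Toy grid-search results, invented for illustration only (NOT from the paper).
# results[data_set][(perplexity, learning_rate)] = embedding accuracy in [0, 1]
results = {
    "data_set_a": {(5, 200): 0.80, (30, 200): 0.90, (50, 200): 0.85},
    "data_set_b": {(5, 200): 0.75, (30, 200): 0.70, (50, 200): 0.65},
    "data_set_c": {(5, 200): 0.60, (30, 200): 0.70, (50, 200): 0.75},
}


def best_single_setting(results):
    """Return the one setting with the highest mean accuracy across data sets.

    Hypothetical helper: assumes every data set was evaluated on the same grid.
    """
    settings = next(iter(results.values())).keys()
    return max(settings, key=lambda s: mean(r[s] for r in results.values()))


def oracle_gap(results, setting):
    """Mean accuracy lost by using `setting` everywhere instead of
    each data set's own best grid point (the per-data-set oracle)."""
    return mean(max(r.values()) - r[setting] for r in results.values())


default = best_single_setting(results)
gap = oracle_gap(results, default)
print(f"best single setting: {default}")
print(f"mean gap to per-data-set oracle: {gap * 100:.1f} percentage points")
```

With real grid-search data, the gap printed at the end would play the role of the "potential for future improvements" that the description quantifies; the toy numbers here say nothing about the paper's actual results.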