New guidance for using t-SNE: Alternative defaults, hyperparameter selection automation, and comparative evaluation


Bibliographic Details
Main Authors:	Robert Gove, Lucas Cadalzo, Nicholas Leiby, Jedediah M. Singer, Alexander Zaitzeff
Format:	Article
Language:	English
Published:	Elsevier 2022-06-01
Series:	Visual Informatics
Subjects:	Dimensionality reduction; Machine learning; t-SNE
Online Access:	http://www.sciencedirect.com/science/article/pii/S2468502X22000201

author	Robert Gove, Lucas Cadalzo, Nicholas Leiby, Jedediah M. Singer, Alexander Zaitzeff
author_sort	Robert Gove
collection	DOAJ
description	We present new guidelines for choosing hyperparameters for t-SNE and an evaluation comparing these guidelines to current ones. These guidelines include a proposed empirically optimum guideline derived from a t-SNE hyperparameter grid search over a large collection of data sets. We also introduce a new method to featurize data sets using graph-based metrics called scagnostics; we use these features to train a neural network that predicts optimal t-SNE hyperparameters for the respective data set. This neural network has the potential to simplify the use of t-SNE by removing guesswork about which hyperparameters will produce the best embedding. We evaluate and compare our neural network-derived and empirically optimum hyperparameters to several other t-SNE hyperparameter guidelines from the literature on 68 data sets. The hyperparameters predicted by our neural network yield embeddings with accuracy similar to that of the best current t-SNE guidelines. Using our empirically optimum hyperparameters is simpler than following previously published guidelines but yields more accurate embeddings, in some cases by a statistically significant margin. We find that the useful ranges for t-SNE hyperparameters are narrower and include smaller values than previously reported in the literature. Importantly, we also quantify the potential for future improvements in this area: using data from a grid search of t-SNE hyperparameters, we find that an optimal selection method could improve embedding accuracy by up to two percentage points over the methods examined in this paper.
first_indexed	2024-12-10T20:08:32Z
format	Article
id	doaj.art-594babfbfc27491ca8f51a4869de6823
institution	Directory of Open Access Journals
issn	2468-502X
language	English
last_indexed	2024-12-10T20:08:32Z
publishDate	2022-06-01
publisher	Elsevier
record_format	Article
series	Visual Informatics
spelling	doaj.art-594babfbfc27491ca8f51a4869de6823; 2022-12-22T01:35:20Z; eng; Elsevier; Visual Informatics; 2468-502X; 2022-06-01; volume 6, issue 2, pages 87-97; New guidance for using t-SNE: Alternative defaults, hyperparameter selection automation, and comparative evaluation; Robert Gove (corresponding author; Two Six Technologies, USA), Lucas Cadalzo (Two Six Technologies, USA), Nicholas Leiby (Two Six Technologies, USA), Jedediah M. Singer (Two Six Technologies, USA), Alexander Zaitzeff (Two Six Technologies, USA); http://www.sciencedirect.com/science/article/pii/S2468502X22000201; Dimensionality reduction; Machine learning; t-SNE
title	New guidance for using t-SNE: Alternative defaults, hyperparameter selection automation, and comparative evaluation
topic	Dimensionality reduction; Machine learning; t-SNE
url	http://www.sciencedirect.com/science/article/pii/S2468502X22000201
