GPz: non-stationary sparse Gaussian processes for heteroscedastic uncertainty estimation in photometric redshifts

The next generation of cosmology experiments will be required to use photometric redshifts rather than spectroscopic redshifts. Obtaining accurate and well-characterized photometric redshift distributions is therefore critical for Euclid, the Large Synoptic Survey Telescope and the Square Kilometre...

Bibliographic details
Main authors: Almosallam, I, Jarvis, M, Roberts, S
Format: Journal article
Published: Oxford University Press 2016
_version_ 1826263730161188864
author Almosallam, I
Jarvis, M
Roberts, S
collection OXFORD
description The next generation of cosmology experiments will be required to use photometric redshifts rather than spectroscopic redshifts. Obtaining accurate and well-characterized photometric redshift distributions is therefore critical for Euclid, the Large Synoptic Survey Telescope and the Square Kilometre Array. However, determining accurate variance predictions alongside single point estimates is crucial, as they can be used to optimize the sample of galaxies for the specific experiment (e.g. weak lensing, baryon acoustic oscillations, supernovae), trading off between completeness and reliability in the galaxy sample. The various sources of uncertainty in measurements of the photometry and redshifts put a lower bound on the accuracy that any model can hope to achieve. The intrinsic uncertainty associated with estimates is often non-uniform and input-dependent, commonly known in statistics as heteroscedastic noise. However, existing approaches are susceptible to outliers, do not take into account variance induced by non-uniform data density, and in most cases require manual tuning of many parameters. In this paper, we present a Bayesian machine learning approach that jointly optimizes the model with respect to both the predictive mean and variance, which we refer to as Gaussian processes for photometric redshifts (GPZ). The predictive variance of the model takes into account both the variance due to data density and photometric noise. Using the Sloan Digital Sky Survey (SDSS) DR12 data, we show that our approach substantially outperforms other machine learning methods, such as TPZ and ANNZ2, for photo-z estimation and the associated variance. We provide MATLAB and Python implementations, available to download at https://github.com/OxfordML/GPz.
first_indexed 2024-03-06T19:56:27Z
format Journal article
id oxford-uuid:25c4da8d-11a7-4de6-b986-b2b9af78f17b
institution University of Oxford
last_indexed 2024-03-06T19:56:27Z
publishDate 2016
publisher Oxford University Press
record_format dspace
title GPz: non-stationary sparse Gaussian processes for heteroscedastic uncertainty estimation in photometric redshifts