Covariance Matrix Preparation for Quantum Principal Component Analysis

Principal component analysis (PCA) is a dimensionality reduction method in data analysis that involves diagonalizing the covariance matrix of the dataset. Recently, quantum algorithms have been formulated for PCA based on diagonalizing a density matrix. These algorithms assume that the covariance ma...


Bibliographic Details
Main Authors:	Max Hunter Gordon, M. Cerezo, Lukasz Cincio, Patrick J. Coles
Format:	Article
Language:	English
Published:	American Physical Society 2022-09-01
Series:	PRX Quantum
Online Access:	http://doi.org/10.1103/PRXQuantum.3.030334

author	Max Hunter Gordon, M. Cerezo, Lukasz Cincio, Patrick J. Coles
author_sort	Max Hunter Gordon
collection	DOAJ
description	Principal component analysis (PCA) is a dimensionality reduction method in data analysis that involves diagonalizing the covariance matrix of the dataset. Recently, quantum algorithms have been formulated for PCA based on diagonalizing a density matrix. These algorithms assume that the covariance matrix can be encoded in a density matrix, but a concrete protocol for this encoding has been lacking. Our work aims to address this gap. Assuming amplitude encoding of the data, with the data given by the ensemble $\{p_i, |\psi_i\rangle\}$, then one can easily prepare the ensemble average density matrix $\overline{\rho} = \sum_i p_i |\psi_i\rangle\langle\psi_i|$. We first show that $\overline{\rho}$ is precisely the covariance matrix whenever the dataset is centered. For quantum datasets, we exploit global phase symmetry to argue that there always exists a centered dataset consistent with $\overline{\rho}$, and hence $\overline{\rho}$ can always be interpreted as a covariance matrix. This provides a simple means for preparing the covariance matrix for arbitrary quantum datasets or centered classical datasets. For uncentered classical datasets, our method is so-called “PCA without centering,” which we interpret as PCA on a symmetrized dataset. We argue that this closely corresponds to standard PCA, and we derive equations and inequalities that bound the deviation of the spectrum obtained with our method from that of standard PCA. We numerically illustrate our method for the Modified National Institute of Standards and Technology (MNIST) handwritten digit dataset. We also argue that PCA on quantum datasets is natural and meaningful, and we numerically implement our method for molecular ground-state datasets.
first_indexed	2024-04-11T10:53:19Z
format	Article
id	doaj.art-7a92e1fe01a64f758df7eaf52ab15e2f
institution	Directory of Open Access Journals
issn	2691-3399
language	English
last_indexed	2024-04-11T10:53:19Z
publishDate	2022-09-01
publisher	American Physical Society
record_format	Article
series	PRX Quantum
title	Covariance Matrix Preparation for Quantum Principal Component Analysis
url	http://doi.org/10.1103/PRXQuantum.3.030334
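Illustrative sketch (not taken from the article): the description's central claim, that the ensemble average density matrix coincides with the covariance matrix when the dataset is centered, can be checked numerically on toy data. The NumPy code below is a minimal sketch under stated assumptions: the helper names `ensemble_average` and `covariance` are hypothetical, the data are random vectors rather than MNIST or molecular states, and "symmetrized dataset" is assumed here to mean adjoining the negation of every data point with the weight split equally.

```python
import numpy as np

def ensemble_average(states, probs):
    """Return sum_i p_i |psi_i><psi_i| for states stored as rows (hypothetical helper)."""
    states = np.asarray(states, dtype=complex)
    probs = np.asarray(probs, dtype=float)
    # einsum accumulates p_i * outer(psi_i, conj(psi_i)) over the ensemble index i
    return np.einsum("i,ij,ik->jk", probs, states, states.conj())

def covariance(states, probs):
    """Weighted covariance sum_i p_i (psi_i - mu)(psi_i - mu)^dagger (hypothetical helper)."""
    states = np.asarray(states, dtype=complex)
    probs = np.asarray(probs, dtype=float)
    mu = probs @ states              # weighted mean vector
    centered = states - mu
    return np.einsum("i,ij,ik->jk", probs, centered, centered.conj())

# Toy classical dataset: 6 points in 4 dimensions, normalized to unit length
# to mimic amplitude encoding (assumption: uniform probabilities p_i = 1/6).
rng = np.random.default_rng(0)
raw = rng.normal(size=(6, 4))
states = raw / np.linalg.norm(raw, axis=1, keepdims=True)
probs = np.full(6, 1 / 6)

rho = ensemble_average(states, probs)
cov = covariance(states, probs)
mu = probs @ states.astype(complex)

# Assumed symmetrization: each point and its negation, each with half the weight.
sym_states = np.vstack([states, -states])
sym_probs = np.concatenate([probs, probs]) / 2
sym_cov = covariance(sym_states, sym_probs)

# Spectra in descending order: "PCA without centering" versus standard PCA.
spectrum_uncentered = np.linalg.eigvalsh(rho)[::-1]
spectrum_standard = np.linalg.eigvalsh(cov)[::-1]
print("uncentered spectrum:", spectrum_uncentered)
print("standard spectrum:  ", spectrum_standard)
```

In this sketch `rho` differs from `cov` only by the rank-one term built from the mean vector, and the covariance of the symmetrized dataset reproduces `rho` exactly, which is one way to read the "PCA without centering" interpretation given in the description.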


Similar Items