Tensor network to learn the wave function of data

Tensor network architectures have recently emerged as a promising approach to various machine learning tasks, both supervised and unsupervised. In this work we introduce a matrix product state-based network that simultaneously accomplishes two tasks: classification (discrimination) and sampling of visual data. We train the network on a binary (black and white) version of MNIST, a data set of handwritten digits, to recognize as well as to sample images of a particular digit. We show that the trained network qualitatively represents the indicator function of the “full set” of all possible images of a given format depicting that digit. While the notion of the full set is difficult to define from first principles, our construction provides a working definition, and we show that different ways to build and train the network lead to similar results. We emphasize that this means the trained network learns the “wave function of data,” i.e., it can be used to characterize the data itself, providing a novel tool to study global properties of the data sets of interest. First, using a quantum mechanical interpretation, we characterize the full set by calculating its entanglement entropy. We then study its geometric properties, such as mean Hamming distance, effective dimension, and size. The latter is the total number of images in the binary black and white MNIST format that would be recognized as depicting a particular digit. Alternatively, it is the number of images of a given digit one would need to sample before the probability of sampling the same image twice becomes of order one. While this number cannot be defined completely rigorously, we show that its logarithm is largely independent of how the network is defined and trained. We find that this number varies dramatically between digits, from 2^{22} for digit 1 to 2^{92} for digit 8.
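
To make the matrix product state (MPS) construction concrete, here is a minimal, hypothetical sketch of how such a network assigns an amplitude (a “wave function” value) to a binary image. The tensor shapes, random weights, and trace contraction below are illustrative assumptions, not the authors' architecture or trained parameters.

    # A minimal, hypothetical sketch (not the authors' code): an MPS assigns
    # an amplitude psi(x) to a binary image x by contracting one matrix per
    # pixel, with the matrix selected by that pixel's value.
    import numpy as np

    n_pixels, chi = 16, 8                  # toy 4x4 image, bond dimension 8 (illustrative)
    rng = np.random.default_rng(0)
    # One rank-3 tensor per pixel, shape (2, chi, chi): first index = pixel value 0/1.
    tensors = [rng.normal(size=(2, chi, chi)) / np.sqrt(chi) for _ in range(n_pixels)]

    def amplitude(bits, tensors):
        """psi(x) = Tr[A_1[x_1] A_2[x_2] ... A_n[x_n]] for binary pixels x_i."""
        m = np.eye(tensors[0].shape[1])
        for bit, a in zip(bits, tensors):
            m = m @ a[bit]                 # multiply in the matrix chosen by this pixel
        return np.trace(m)

    x = rng.integers(0, 2, size=n_pixels)  # a random binary "image"
    prob = amplitude(x, tensors) ** 2      # Born rule: unnormalized probability of x

Because psi(x)^2 defines an (unnormalized) probability over all 2^n binary images, one trained object can both score images for classification and be sampled from, which is the dual use described above.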

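The entanglement entropy mentioned above is the standard bipartite entropy of the learned state: cut the pixel chain at a bond, Schmidt-decompose, and sum over the squared Schmidt coefficients. In generic textbook form (not a result specific to this paper):

    $|\psi\rangle = \sum_k \lambda_k\, |L_k\rangle \otimes |R_k\rangle, \qquad S = -\sum_k \lambda_k^2 \ln \lambda_k^2, \qquad \sum_k \lambda_k^2 = 1$

For an MPS in canonical form, the \lambda_k at a given bond are its singular values there, so S can be read off directly from the trained network.
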
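The sampling-based reading of the set size is a birthday-problem statement. One standard way to quantify it (a generic gloss, not necessarily the exact estimator used in the paper) is through the collision probability of the sampling distribution p:

    $N_{\mathrm{eff}} = \Big(\sum_x p(x)^2\Big)^{-1}$

For a uniform distribution over N images this gives N_eff = N, and a repeated image becomes likely after roughly \sqrt{N_{\mathrm{eff}}} independent draws.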

Bibliographic Details
Main Authors: Anatoly Dymarsky, Kirill Pavlenko
Format: Article
Language: English
Published: American Physical Society, 2022-11-01
Series: Physical Review Research, Vol. 4, Art. 043111
ISSN: 2643-1564
Online Access: http://doi.org/10.1103/PhysRevResearch.4.043111