Tensor network to learn the wave function of data

Tensor network architectures have recently emerged as a promising approach to various machine learning tasks, both supervised and unsupervised. In this work we introduce a matrix product state-based network that simultaneously accomplishes two tasks: classification (discrimination) and sampling of visual data. We train the network on a binary (black and white) version of MNIST, a data set of handwritten digits, to recognize as well as to sample images of a particular digit. We show that the trained network qualitatively represents the indicator function of the “full set” of all possible images of a given format depicting that digit. While the notion of the full set is difficult to define from first principles, our construction provides a working definition, and we show that different ways to build and train the network lead to similar results. We emphasize that this means the trained network learns the “wave function of data,” i.e., it can be used to characterize the data itself, providing a novel tool to study global properties of the data sets of interest. First, using a quantum mechanical interpretation, we characterize the full set by calculating its entanglement entropy. We then study its geometric properties, such as mean Hamming distance, effective dimension, and size. The latter is the total number of images in the binary black and white MNIST format that would be recognized as depicting a particular digit. Alternatively, it is the number of images of a given digit one would need to sample before the probability of sampling the same image twice becomes of order one. While this number cannot be defined completely rigorously, we show that its logarithm is largely independent of how the network is defined and trained. We find that this number varies dramatically between digits, from 2^{22} for digit 1 to 2^{92} for digit 8.
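
To make the matrix product state (MPS) construction concrete, here is a minimal, hypothetical sketch of how such a network assigns an amplitude (a “wave function” value) to a binary image. The tensor shapes, random weights, and trace contraction below are illustrative assumptions, not the authors' architecture or trained parameters.

    # A minimal, hypothetical sketch (not the authors' code): an MPS assigns
    # an amplitude psi(x) to a binary image x by contracting one matrix per
    # pixel, with the matrix selected by that pixel's value.
    import numpy as np

    n_pixels, chi = 16, 8                  # toy 4x4 image, bond dimension 8 (illustrative)
    rng = np.random.default_rng(0)
    # One rank-3 tensor per pixel, shape (2, chi, chi): first index = pixel value 0/1.
    tensors = [rng.normal(size=(2, chi, chi)) / np.sqrt(chi) for _ in range(n_pixels)]

    def amplitude(bits, tensors):
        """psi(x) = Tr[A_1[x_1] A_2[x_2] ... A_n[x_n]] for binary pixels x_i."""
        m = np.eye(tensors[0].shape[1])
        for bit, a in zip(bits, tensors):
            m = m @ a[bit]                 # multiply in the matrix chosen by this pixel
        return np.trace(m)

    x = rng.integers(0, 2, size=n_pixels)  # a random binary "image"
    prob = amplitude(x, tensors) ** 2      # Born rule: unnormalized probability of x

Because psi(x)^2 defines an (unnormalized) probability over all 2^n binary images, one trained object can both score images for classification and be sampled from, which is the dual use described above.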

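The entanglement entropy mentioned above is the standard bipartite entropy of the learned state: cut the pixel chain at a bond, Schmidt-decompose, and sum over the squared Schmidt coefficients. In generic textbook form (not a result specific to this paper):

    $|\psi\rangle = \sum_k \lambda_k\, |L_k\rangle \otimes |R_k\rangle, \qquad S = -\sum_k \lambda_k^2 \ln \lambda_k^2, \qquad \sum_k \lambda_k^2 = 1$

For an MPS in canonical form, the \lambda_k at a given bond are its singular values there, so S can be read off directly from the trained network.
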
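The sampling-based reading of the set size is a birthday-problem statement. One standard way to quantify it (a generic gloss, not necessarily the exact estimator used in the paper) is through the collision probability of the sampling distribution p:

    $N_{\mathrm{eff}} = \Big(\sum_x p(x)^2\Big)^{-1}$

For a uniform distribution over N images this gives N_eff = N, and a repeated image becomes likely after roughly \sqrt{N_{\mathrm{eff}}} independent draws.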

Bibliographic Details
Main Authors: Anatoly Dymarsky, Kirill Pavlenko
Format: Article
Language: English
Published: American Physical Society, 2022-11-01
Series: Physical Review Research, Vol. 4, Art. 043111
ISSN: 2643-1564
Online Access: http://doi.org/10.1103/PhysRevResearch.4.043111