Tensor network to learn the wave function of data
Tensor network architectures have recently emerged as a promising approach to various machine learning tasks, both supervised and unsupervised. In this work we introduce a matrix product state-based network that simultaneously accomplishes two tasks: classification (discrimination) and sampling of visual data. We train the network on a binary (black and white) version of MNIST, a data set of handwritten digits, to recognize as well as to sample images of a particular digit. We show that the trained network qualitatively represents the indicator function of the "full set" of all possible images of a given format depicting the particular digit. While the notion of the full set is difficult to define from first principles, our construction provides a working definition, and we show that different ways of building and training the network lead to similar results. We emphasize that this means the trained network learns the "wave function of data," i.e., it can be used to characterize the data itself, providing a novel tool to study global properties of the data sets of interest. First, using the quantum mechanical interpretation, we characterize the full set by calculating its entanglement entropy. We then study its geometric properties, such as the mean Hamming distance, effective dimension, and size. The latter is the total number of images in the binary black-and-white MNIST format that would be recognized as depicting a particular digit. Alternatively, it is the number of images of a given digit one would need to sample before the probability of sampling the same image twice becomes of order one. While this number cannot be defined completely rigorously, we show that its logarithm is largely independent of the way the network is defined and trained. We find that this number varies dramatically between digits, from 2^{22} for digit 1 to 2^{92} for digit 8.
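As a concrete illustration of the abstract's central object, here is a minimal sketch (not the authors' code) of how a matrix product state assigns an amplitude to a binary image; |psi(x)|^2 then plays the role of an unnormalized probability, and a classifier would compare this weight across per-digit networks. The image size, bond dimension, random initialization, and all names below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch, assuming a binary image per the abstract: the amplitude
# psi(x) that an (untrained, randomly initialized) MPS assigns to x.
rng = np.random.default_rng(0)

N_PIXELS = 28 * 28   # binary MNIST resolution (assumed)
BOND_DIM = 10        # illustrative bond dimension

# One order-3 tensor per pixel: (left bond, physical index in {0, 1}, right bond);
# the chain ends carry trivial bonds of dimension 1. Entries are scaled by
# 1/sqrt(BOND_DIM) to keep the long contraction numerically tame.
mps = [rng.normal(scale=BOND_DIM ** -0.5,
                  size=(1 if i == 0 else BOND_DIM,
                        2,
                        1 if i == N_PIXELS - 1 else BOND_DIM))
       for i in range(N_PIXELS)]

def amplitude(x):
    """Contract the MPS left to right, picking each tensor's slice by pixel value."""
    v = mps[0][:, x[0], :]              # shape (1, BOND_DIM)
    for i in range(1, N_PIXELS):
        v = v @ mps[i][:, x[i], :]      # running (1, bond) row vector
    return v[0, 0]

# |psi(x)|^2 acts as an unnormalized probability weight for image x.
x = rng.integers(0, 2, size=N_PIXELS)
weight = amplitude(x) ** 2
```

In a trained network of this kind, the same contraction would be run through one MPS per digit, with the largest weight deciding the class.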
Main Authors: | Anatoly Dymarsky, Kirill Pavlenko |
---|---|
Format: | Article |
Language: | English |
Published: | American Physical Society, 2022-11-01 |
Series: | Physical Review Research, vol. 4, article 043111 |
ISSN: | 2643-1564 |
Collection: | DOAJ (Directory of Open Access Journals) |
Online Access: | http://doi.org/10.1103/PhysRevResearch.4.043111 |
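The second half of the abstract characterizes the learned "full set" via entanglement entropy and geometric statistics such as the mean Hamming distance. The helpers below sketch the standard way such quantities are computed; they are generic textbook formulas, not the paper's code, and assume, respectively, Schmidt coefficients from an MPS brought to canonical form and a 0/1 array of sampled images.

```python
import numpy as np

def entanglement_entropy(schmidt_values):
    """Von Neumann entropy across a bond cut of an MPS.

    Assumes the network is in canonical form, so the singular values at
    the cut are Schmidt coefficients whose squares sum to 1.
    """
    p = np.asarray(schmidt_values, dtype=float) ** 2
    p = p[p > 1e-12]                     # discard numerical zeros
    return float(-np.sum(p * np.log(p)))

def mean_hamming_distance(samples):
    """Mean pairwise Hamming distance over a 0/1 array of sampled images (rows).

    Per pixel, the fraction of differing unordered pairs is
    2*q*(1-q)*n/(n-1), where q is the fraction of ones among n samples;
    summing over pixels gives the mean pairwise distance.
    """
    X = np.asarray(samples, dtype=float)
    n = X.shape[0]
    q = X.mean(axis=0)
    return float(np.sum(2.0 * q * (1.0 - q)) * n / (n - 1))
```

In the same spirit, the abstract's sampling-based size estimate is a birthday-problem style diagnostic: one tracks how many draws from the trained network occur before the same image appears twice.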