Towards provably efficient quantum algorithms for large-scale machine-learning models
Abstract: Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include the huge computational expense, power, and time consumed in both the pre-training and fine-tuning processes. In this work, we show that fault-tolerant quantum computing could provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$, where $n$ is the size of the model and $T$ is the number of training iterations, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Building on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models ranging from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
Main Authors: | Junyu Liu, Minzhao Liu, Jin-Peng Liu, Ziyu Ye, Yunfei Wang, Yuri Alexeev, Jens Eisert, Liang Jiang |
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-01-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-023-43957-x |
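The scaling claim in the abstract, $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$ for $T$ iterations on an $n$-parameter model, rests on treating (stochastic) gradient descent as a sufficiently dissipative, sparse dynamical system. The worked equations below are a heuristic sketch of that reading, not the paper's derivation; the linearization matrix $A$, learning rate $\eta$, and parameter vector $\theta_t$ are illustrative notation introduced here.

$$
\theta_{t+1} \;=\; \theta_t - \eta\,\nabla L(\theta_t)
\;\approx\; (\mathbb{1} - \eta A)\,\theta_t,
\qquad A \succeq 0 \ \text{(dissipative)},\quad A\ \text{sparse},\quad \eta\ \text{small},
$$
$$
\theta_T \;\approx\; (\mathbb{1} - \eta A)^{T}\,\theta_0,
\qquad \big\|\mathbb{1} - \eta A\big\| \le 1 .
$$

Under this reading, each training step is a sparse, norm-non-increasing linear map on an $n$-dimensional vector. Quantum solvers for dissipative linear differential equations can apply such maps to a state of only $\lceil \log_2 n \rceil$ qubits at cost polylogarithmic in $n$; the abstract quotes a quadratic dependence on $T$ as the overhead of chaining $T$ such steps with controlled error.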
_version_ | 1797355628045271040 |
author | Junyu Liu; Minzhao Liu; Jin-Peng Liu; Ziyu Ye; Yunfei Wang; Yuri Alexeev; Jens Eisert; Liang Jiang |
author_facet | Junyu Liu; Minzhao Liu; Jin-Peng Liu; Ziyu Ye; Yunfei Wang; Yuri Alexeev; Jens Eisert; Liang Jiang |
author_sort | Junyu Liu |
collection | DOAJ |
description | Abstract: Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include the huge computational expense, power, and time consumed in both the pre-training and fine-tuning processes. In this work, we show that fault-tolerant quantum computing could provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$, where $n$ is the size of the model and $T$ is the number of training iterations, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Building on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models ranging from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems. |
first_indexed | 2024-03-08T14:14:53Z |
format | Article |
id | doaj.art-280da3c96c15499cb7a5adc603d1bc07 |
institution | Directory Open Access Journal |
issn | 2041-1723 |
language | English |
last_indexed | 2024-03-08T14:14:53Z |
publishDate | 2024-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj.art-280da3c96c15499cb7a5adc603d1bc07 (indexed 2024-01-14T12:28:10Z) | eng | Nature Portfolio | Nature Communications | 2041-1723 | 2024-01-01 | vol. 15, iss. 1, pp. 1-6 | 10.1038/s41467-023-43957-x | Towards provably efficient quantum algorithms for large-scale machine-learning models | Junyu Liu (Pritzker School of Molecular Engineering, The University of Chicago); Minzhao Liu (Department of Physics, The University of Chicago); Jin-Peng Liu (Simons Institute for the Theory of Computing, University of California); Ziyu Ye (Department of Computer Science, The University of Chicago); Yunfei Wang (Martin A. Fisher School of Physics, Brandeis University); Yuri Alexeev (Department of Computer Science, The University of Chicago); Jens Eisert (Dahlem Center for Complex Quantum Systems, Free University Berlin); Liang Jiang (Pritzker School of Molecular Engineering, The University of Chicago) | https://doi.org/10.1038/s41467-023-43957-x |
spellingShingle | Junyu Liu; Minzhao Liu; Jin-Peng Liu; Ziyu Ye; Yunfei Wang; Yuri Alexeev; Jens Eisert; Liang Jiang; Towards provably efficient quantum algorithms for large-scale machine-learning models; Nature Communications |
title | Towards provably efficient quantum algorithms for large-scale machine-learning models |
title_full | Towards provably efficient quantum algorithms for large-scale machine-learning models |
title_fullStr | Towards provably efficient quantum algorithms for large-scale machine-learning models |
title_full_unstemmed | Towards provably efficient quantum algorithms for large-scale machine-learning models |
title_short | Towards provably efficient quantum algorithms for large-scale machine-learning models |
title_sort | towards provably efficient quantum algorithms for large scale machine learning models |
url | https://doi.org/10.1038/s41467-023-43957-x |
work_keys_str_mv | AT junyuliu towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT minzhaoliu towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT jinpengliu towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT ziyuye towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT yunfeiwang towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT yurialexeev towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT jenseisert towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels AT liangjiang towardsprovablyefficientquantumalgorithmsforlargescalemachinelearningmodels |
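The abstract also motivates a "sparse parameter download and re-upload scheme": after pruning, only the surviving parameters would be read out of the quantum device and later loaded back in to continue training. Below is a minimal classical sketch of that workflow, assuming magnitude pruning and plain gradient descent on a toy least-squares objective; the objective, the helper names, and the NumPy copies standing in for quantum readout and state preparation are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares model: L(theta) = 0.5 * ||X theta - y||^2 (illustrative only).
n, m = 1024, 256
X = rng.normal(size=(m, n)) / np.sqrt(n)
y = rng.normal(size=m)

def grad(theta):
    """Gradient of the toy least-squares loss."""
    return X.T @ (X @ theta - y)

def magnitude_prune(theta, keep_fraction=0.1):
    """Keep only the largest-magnitude weights; zero out the rest."""
    k = max(1, int(keep_fraction * theta.size))
    threshold = np.sort(np.abs(theta))[-k]
    mask = np.abs(theta) >= threshold
    return theta * mask, mask

theta = rng.normal(size=n)
eta = 0.1

# Dense warm-up phase (stands in for pre-training).
for _ in range(50):
    theta -= eta * grad(theta)

# "Download": prune, then read out only the surviving sparse parameters.
theta_sparse, mask = magnitude_prune(theta, keep_fraction=0.1)
downloaded = theta_sparse[mask]          # the small vector actually communicated

# "Re-upload": rebuild the sparse parameter vector and continue sparse training.
theta = np.zeros(n)
theta[mask] = downloaded
for _ in range(50):
    g = grad(theta) * mask               # restrict updates to the sparse support
    theta -= eta * g

print("final loss:", 0.5 * np.linalg.norm(X @ theta - y) ** 2)
```

The point of the sketch is only the data flow: what crosses the classical/quantum boundary is the small vector `downloaded`, whose length is set by the pruning ratio rather than by the full model size $n$.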