Towards provably efficient quantum algorithms for large-scale machine-learning models

Bibliographic Details
Main Authors: Junyu Liu, Minzhao Liu, Jin-Peng Liu, Ziyu Ye, Yunfei Wang, Yuri Alexeev, Jens Eisert, Liang Jiang
Format: Article
Language: English
Published: Nature Portfolio, 2024-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-023-43957-x
Collection: DOAJ (Directory of Open Access Journals)
ISSN: 2041-1723

Abstract: Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include the huge computational expense, power, and time required in both the pre-training and fine-tuning processes. In this work, we show that fault-tolerant quantum computing could provide provably efficient solutions for generic (stochastic) gradient descent algorithms, scaling as $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$, where n is the size of the models and T is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Building on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models with 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work provides solid evidence that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.

Author Affiliations:
Junyu Liu: Pritzker School of Molecular Engineering, The University of Chicago
Minzhao Liu: Department of Physics, The University of Chicago
Jin-Peng Liu: Simons Institute for the Theory of Computing, University of California
Ziyu Ye: Department of Computer Science, The University of Chicago
Yunfei Wang: Martin A. Fisher School of Physics, Brandeis University
Yuri Alexeev: Department of Computer Science, The University of Chicago
Jens Eisert: Dahlem Center for Complex Quantum Systems, Free University Berlin
Liang Jiang: Pritzker School of Molecular Engineering, The University of Chicago
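
The sparse-training setting described in the abstract, gradient descent applied only to the parameters that survive pruning with a small learning rate, can be pictured with a purely classical sketch. The toy quadratic loss, the 10% keep ratio, the learning rate, and all names below are illustrative assumptions; this is not the paper's quantum algorithm, its benchmark models, or its download and re-upload scheme.

```python
# Minimal classical sketch (NumPy) of sparse training after magnitude pruning:
# prune a dense parameter vector, then run gradient descent for T iterations
# on the surviving entries only. The quadratic toy loss, the 10% keep ratio,
# and the learning rate are assumptions made purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                       # number of model parameters (the abstract's n)
theta = rng.normal(size=n)       # dense parameters before pruning

# Toy loss L(theta) = 0.5 * sum(scale * theta**2); its gradient is scale * theta.
scale = rng.uniform(0.5, 1.5, size=n)

def loss_grad(params: np.ndarray) -> np.ndarray:
    """Gradient of the toy quadratic loss."""
    return scale * params

# Magnitude pruning: keep the 10% largest-magnitude weights, zero out the rest.
keep = np.argsort(np.abs(theta))[-n // 10:]
mask = np.zeros(n, dtype=bool)
mask[keep] = True
theta = theta * mask

eta, T = 0.05, 100               # small learning rate, T training iterations
for _ in range(T):
    grad = loss_grad(theta)
    theta[mask] -= eta * grad[mask]   # update only the sparse support

print("nonzero parameters:", int(mask.sum()),
      "final loss:", 0.5 * float(np.sum(scale * theta**2)))
```

On a classical machine each iteration costs time proportional to the number of surviving parameters; the abstract's claim is that, for sufficiently dissipative and sparse models, a fault-tolerant quantum algorithm could carry out the analogous T iterations in time scaling as $\mathcal{O}(T^2 \times \mathrm{polylog}(n))$.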