On the Complexity of Neural Computation in Superposition
Main Authors: | Adler, Micah; Shavit, Nir
---|---
Format: | Article
Language: | English (en_US)
Published: | 2024
Subjects: | superposition; neural network; neurons; complexity
Online Access: | https://hdl.handle.net/1721.1/157073
collection | MIT |
description | Recent advances in the understanding of neural networks suggest that superposition, the ability of a single neuron to represent multiple features simultaneously, is a key mechanism underlying the computational efficiency of large-scale networks. This paper explores the theoretical foundations of computing in superposition, focusing on explicit, provably correct algorithms and their efficiency.
We present the first lower bounds showing that for a broad class of problems, including permutations and pairwise logical operations, a neural network computing in superposition requires at least Ω(m′ log m′) parameters and Ω(√(m′ log m′)) neurons, where m′ is the number of output features being computed. This implies that any “lottery ticket” sparse sub-network must have at least Ω(m′ log m′) parameters no matter what the initial dense network size. Conversely, we show a nearly tight upper bound: logical operations like pairwise AND can be computed using O(√(m′) log m′) neurons and O(m′ log² m′) parameters. There is thus an exponential gap between computing in superposition, the subject of this work, and representing features in superposition, which can require as little as O(log m′) neurons based on the Johnson-Lindenstrauss Lemma.
Our hope is that our results open a path for using complexity-theoretic techniques in neural network interpretability research. (For a concrete sense of the scale of these bounds, see the numerical sketch after the record fields below.) |
id | mit-1721.1/157073 |
institution | Massachusetts Institute of Technology |
rights | Attribution-NonCommercial-NoDerivs 3.0 United States (http://creativecommons.org/licenses/by-nc-nd/3.0/us/)
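To make the gap described in the abstract concrete, here is a minimal illustrative sketch (not from the paper): it evaluates the stated bound expressions at a sample m′, with all hidden constants set to 1 and logarithms taken base 2, so the printed values indicate orders of magnitude only.

```python
import math

def bound_values(m_prime: int) -> dict:
    """Evaluate the bound expressions from the abstract at a concrete m'.

    All hidden constants are set to 1 and logs are base 2, so the
    results are orders of magnitude, not exact counts.
    """
    log_m = math.log2(m_prime)
    return {
        "representing: neurons, O(log m')": log_m,
        "computing: neurons lower bound, Omega(sqrt(m' log m'))": math.sqrt(m_prime * log_m),
        "computing: neurons upper bound, O(sqrt(m') log m')": math.sqrt(m_prime) * log_m,
        "computing: params lower bound, Omega(m' log m')": m_prime * log_m,
        "computing: params upper bound, O(m' log^2 m')": m_prime * log_m ** 2,
    }

# With m' = 2**20 (about one million output features), representing the
# features in superposition needs on the order of 20 neurons, while
# computing them needs thousands -- the exponential gap (log m' versus
# roughly sqrt(m')) that the abstract describes.
for name, value in bound_values(2 ** 20).items():
    print(f"{name:55s} ~ {value:,.0f}")
```

Note how the neuron lower and upper bounds for computation differ only by a factor of about √(log m′), which is the sense in which the abstract calls the upper bound nearly tight.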