Linear complexity self-attention with 3rd order polynomials

Self-attention mechanisms and non-local blocks have become crucial building blocks for state-of-the-art neural architectures thanks to their unparalleled ability in capturing long-range dependencies in the input. However their cost is quadratic with the number of spatial positions hence making their...

Full description

Bibliographic Details
Main Authors:	Babiloni, F, Marras, I, Deng, J, Kokkinos, F, Maggioni, M, Chrysos, G, Torr, P, Zafeiriou, S
Format:	Journal article
Language:	English
Published:	IEEE 2023

_version_	1826311990684942336
author	Babiloni, F Marras, I Deng, J Kokkinos, F Maggioni, M Chrysos, G Torr, P Zafeiriou, S
author_facet	Babiloni, F Marras, I Deng, J Kokkinos, F Maggioni, M Chrysos, G Torr, P Zafeiriou, S
author_sort	Babiloni, F
collection	OXFORD
description	Self-attention mechanisms and non-local blocks have become crucial building blocks for state-of-the-art neural architectures thanks to their unparalleled ability in capturing long-range dependencies in the input. However their cost is quadratic with the number of spatial positions hence making their use impractical in many real case applications. In this work, we analyze these methods through a polynomial lens, and we show that self-attention can be seen as a special case of a 3 rd order polynomial. Within this polynomial framework, we are able to design polynomial operators capable of accessing the same data pattern of non-local and self-attention blocks while reducing the complexity from quadratic to linear. As a result, we propose two modules (Poly-NL and Poly-SA) that can be used as “drop-in” replacements for more-complex non-local and self-attention layers in state-of-the-art CNNs and ViT architectures. Our modules can achieve comparable, if not better, performance across a wide range of computer vision tasks while keeping a complexity equivalent to a standard linear layer.
first_indexed	2024-03-07T08:19:24Z
format	Journal article
id	oxford-uuid:ee0acf37-70f2-43b8-b5e2-2171881254fe
institution	University of Oxford
language	English
last_indexed	2024-03-07T08:19:24Z
publishDate	2023
publisher	IEEE
record_format	dspace
spelling	oxford-uuid:ee0acf37-70f2-43b8-b5e2-2171881254fe2024-01-16T14:33:26ZLinear complexity self-attention with 3rd order polynomialsJournal articlehttp://purl.org/coar/resource_type/c_dcae04bcuuid:ee0acf37-70f2-43b8-b5e2-2171881254feEnglishSymplectic ElementsIEEE2023Babiloni, FMarras, IDeng, JKokkinos, FMaggioni, MChrysos, GTorr, PZafeiriou, SSelf-attention mechanisms and non-local blocks have become crucial building blocks for state-of-the-art neural architectures thanks to their unparalleled ability in capturing long-range dependencies in the input. However their cost is quadratic with the number of spatial positions hence making their use impractical in many real case applications. In this work, we analyze these methods through a polynomial lens, and we show that self-attention can be seen as a special case of a 3 rd order polynomial. Within this polynomial framework, we are able to design polynomial operators capable of accessing the same data pattern of non-local and self-attention blocks while reducing the complexity from quadratic to linear. As a result, we propose two modules (Poly-NL and Poly-SA) that can be used as “drop-in” replacements for more-complex non-local and self-attention layers in state-of-the-art CNNs and ViT architectures. Our modules can achieve comparable, if not better, performance across a wide range of computer vision tasks while keeping a complexity equivalent to a standard linear layer.
spellingShingle	Babiloni, F Marras, I Deng, J Kokkinos, F Maggioni, M Chrysos, G Torr, P Zafeiriou, S Linear complexity self-attention with 3rd order polynomials
title	Linear complexity self-attention with 3rd order polynomials
title_full	Linear complexity self-attention with 3rd order polynomials
title_fullStr	Linear complexity self-attention with 3rd order polynomials
title_full_unstemmed	Linear complexity self-attention with 3rd order polynomials
title_short	Linear complexity self-attention with 3rd order polynomials
title_sort	linear complexity self attention with 3rd order polynomials
work_keys_str_mv	AT babilonif linearcomplexityselfattentionwith3rdorderpolynomials AT marrasi linearcomplexityselfattentionwith3rdorderpolynomials AT dengj linearcomplexityselfattentionwith3rdorderpolynomials AT kokkinosf linearcomplexityselfattentionwith3rdorderpolynomials AT maggionim linearcomplexityselfattentionwith3rdorderpolynomials AT chrysosg linearcomplexityselfattentionwith3rdorderpolynomials AT torrp linearcomplexityselfattentionwith3rdorderpolynomials AT zafeirious linearcomplexityselfattentionwith3rdorderpolynomials

Linear complexity self-attention with 3rd order polynomials

Similar Items