Self-attention in vision transformers performs perceptual grouping, not attention

Recently, a considerable number of studies in computer vision involve deep neural architectures called vision transformers. Visual processing in these models incorporates computational models that are claimed to implement attention mechanisms. Despite an increasing body of work that attempts to unde...

Bibliographic Details
Main Authors: Paria Mehrani, John K. Tsotsos
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-06-01
Series: Frontiers in Computer Science
Online Access: https://www.frontiersin.org/articles/10.3389/fcomp.2023.1178450/full