Redesigning Embedding Layers for Queries, Keys, and Values in Cross-Covariance Image Transformers

Several attempts have been made in vision transformers to reduce time complexity from quadratic to linear in the number of tokens. Cross-covariance image transformers (XCiT) are one of the techniques used to address this issue. However, despite these efforts, the...

Bibliographic Details
Main Authors:	Jaesin Ahn, Jiuk Hong, Jeongwoo Ju, Heechul Jung
Format:	Article
Language:	English
Published:	MDPI AG 2023-04-01
Series:	Mathematics
Subjects:	vision transformer; Q/K/V embedding; shared embedding; non-linear embedding; image classification
Online Access:	https://www.mdpi.com/2227-7390/11/8/1933

Similar Items