uCAP: an unsupervised prompting method for vision-language models

This paper addresses a significant limitation that prevents Contrastive Language-Image Pretrained Models (CLIP) from achieving optimal performance on downstream image classification tasks. The key problem with CLIP-style zero-shot classification is that it requires domain-specific context in the for...


Bibliographic details
Main authors:	Nguyen, AT; Tai, KS; Chen, BC; Shukla, SN; Yu, H; Torr, P; Tian, TP; Lim, SN
Format:	Conference item
Language:	English
Published:	Springer 2024

