uCAP: an unsupervised prompting method for vision-language models
This paper addresses a significant limitation that prevents Contrastive Language-Image Pretrained Models (CLIP) from achieving optimal performance on downstream image classification tasks. The key problem with CLIP-style zero-shot classification is that it requires domain-specific context in the for...
Հիմնական հեղինակներ: | , , , , , , , |
---|---|
Ձևաչափ: | Conference item |
Լեզու: | English |
Հրապարակվել է: |
Springer
2024
|