Open vocabulary semantic segmentation with Patch Aligned Contrastive Learning
We introduce Patch Aligned Contrastive Learning (PACL), a modified compatibility function for CLIP's contrastive loss, intending to train an alignment between the patch tokens of the vision encoder and the CLS token of the text encoder. With such an alignment, a model can identify regions of an...
Main authors: | , , , , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | IEEE, 2023 |