Cocktail: mixing multi-modality controls for text-conditional image generation

Text-conditional diffusion models are able to generate high-fidelity images with diverse contents. However, linguistic representations frequently exhibit ambiguous descriptions of the envisioned objective imagery, requiring the incorporation of additional control signals to bolster the efficacy of t...


Bibliographic Details
Main Authors: Hu, Minghui; Zheng, Jianbin; Liu, Daqing; Zheng, Chuanxia; Wang, Chaoyue; Tao, Dacheng; Cham, Tat-Jen
Other Authors: School of Computer Science and Engineering
Format: Conference Paper
Language: English
Published: 2023
Subjects:
Online Access: https://hdl.handle.net/10356/172668
https://nips.cc/virtual/2023/calendar