Cocktail: mixing multi-modality controls for text-conditional image generation
Text-conditional diffusion models are able to generate high-fidelity images with diverse contents. However, linguistic representations frequently exhibit ambiguous descriptions of the envisioned objective imagery, requiring the incorporation of additional control signals to bolster the efficacy of t...
Main Authors: | Hu, Minghui, Zheng, Jianbin, Liu, Daqing, Zheng, Chuanxia, Wang, Chaoyue, Tao, Dacheng, Cham, Tat-Jen |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Conference Paper |
Language: | English |
Published: | 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/172668 https://nips.cc/virtual/2023/calendar |
Similar Items
- Half-body portrait relighting with overcomplete lighting representation
  by: Song, Guoxian, et al.
  Published: (2023)
- Objective quality assessment and perceptual compression of screen content images
  by: Wang, Shiqi, et al.
  Published: (2020)
- Structure-aware generation network for recipe generation from images
  by: Wang, Hao, et al.
  Published: (2021)
- Detection of computer graphics using attention-based dual-branch convolutional neural network from fused color components
  by: He, Peisong, et al.
  Published: (2021)
- Manganese doped fluorescent paramagnetic nanocrystals for dual-modal imaging
  by: Sharma, Vijay Kumar, et al.
  Published: (2015)