ManiGAN: Text-guided image manipulation

Full description

The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method.
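To make the ACM idea in the description concrete, the sketch below shows one plausible form of a text-image affine combination: image features predict a per-location scale and bias that modulate text-conditioned hidden features, so text-relevant regions can be edited while original image content is carried through. This is a minimal illustrative sketch only; the module name, channel sizes, and layer choices are assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class TextImageAffineCombination(nn.Module):
    """Illustrative text-image affine combination (ACM-style) module.

    Fuses text-conditioned hidden features `h` with image features `v`
    by predicting a per-location scale and bias from the image,
    i.e. h' = h * gamma(v) + beta(v). Shapes and layers are placeholders.
    """

    def __init__(self, text_channels: int, image_channels: int):
        super().__init__()
        # Map image features to a scale (gamma) and bias (beta) with the
        # same channel dimension as the text-conditioned features.
        self.gamma = nn.Sequential(
            nn.Conv2d(image_channels, text_channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(text_channels, text_channels, kernel_size=3, padding=1),
        )
        self.beta = nn.Sequential(
            nn.Conv2d(image_channels, text_channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(text_channels, text_channels, kernel_size=3, padding=1),
        )

    def forward(self, h: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        # h: text-conditioned hidden features, shape (B, text_channels, H, W)
        # v: encoded image features,           shape (B, image_channels, H, W)
        # The scale acts as a soft spatial mask selecting text-relevant
        # regions, while the bias re-injects original image content.
        return h * self.gamma(v) + self.beta(v)


if __name__ == "__main__":
    acm = TextImageAffineCombination(text_channels=64, image_channels=128)
    h = torch.randn(1, 64, 32, 32)   # hypothetical hidden text-image features
    v = torch.randn(1, 128, 32, 32)  # hypothetical image features
    print(acm(h, v).shape)           # torch.Size([1, 64, 32, 32])
```

The key design point the description suggests is that the combination is affine rather than a plain concatenation: the bias path gives the network a direct route for reconstructing text-irrelevant content from the original image.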

Bibliographic Details
Main Authors: Li, B, Qi, X, Lukasiewicz, T, Torr, PHS
Format: Conference item
Language: English
Published: IEEE 2020
Collection: OXFORD
Record ID: oxford-uuid:18bad4bb-24e5-4b19-84fa-557eebaca646
Institution: University of Oxford