Cloth interactive transformer for virtual try-on

2D image-based virtual try-on has attracted increasing interest from the multimedia and computer vision communities due to its enormous commercial value. Nevertheless, most existing image-based virtual try-on approaches directly combine the person-identity representation and the in-shop clothing items without considering their mutual correlations. Moreover, these methods are commonly built on pure convolutional neural network (CNN) architectures, which struggle to capture long-range correlations among input pixels and therefore tend to produce inconsistent results. To alleviate these issues, in this article we propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task. In the first stage, we design a CIT matching block that precisely captures the long-range correlations between the cloth-agnostic person information and the in-shop cloth information, making the warped in-shop clothing items look more natural. In the second stage, we introduce a CIT reasoning block that establishes global mutual interactive dependencies among the person representation, the warped clothing item, and the corresponding warped cloth mask. Based on these mutual dependencies, the final try-on results are more realistic. Extensive experiments on a public fashion dataset show that the proposed CIT achieves competitive virtual try-on performance.


Bibliographic Details
Main Authors: Ren, B, Tang, H, Meng, F, Ding, R, Torr, P, Sebe, N
Format: Journal article
Language: English
Published: Association for Computing Machinery 2023
_version_ 1826313819703476224
author Ren, B
Tang, H
Meng, F
Ding, R
Torr, P
Sebe, N
author_facet Ren, B
Tang, H
Meng, F
Ding, R
Torr, P
Sebe, N
author_sort Ren, B
collection OXFORD
description 2D image-based virtual try-on has attracted increasing interest from the multimedia and computer vision communities due to its enormous commercial value. Nevertheless, most existing image-based virtual try-on approaches directly combine the person-identity representation and the in-shop clothing items without considering their mutual correlations. Moreover, these methods are commonly built on pure convolutional neural network (CNN) architectures, which struggle to capture long-range correlations among input pixels and therefore tend to produce inconsistent results. To alleviate these issues, in this article we propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task. In the first stage, we design a CIT matching block that precisely captures the long-range correlations between the cloth-agnostic person information and the in-shop cloth information, making the warped in-shop clothing items look more natural. In the second stage, we introduce a CIT reasoning block that establishes global mutual interactive dependencies among the person representation, the warped clothing item, and the corresponding warped cloth mask. Based on these mutual dependencies, the final try-on results are more realistic. Extensive experiments on a public fashion dataset show that the proposed CIT achieves competitive virtual try-on performance.
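The description above centers on capturing long-range correlations between cloth-agnostic person features and in-shop cloth features via a transformer-style matching block. As a rough illustration only (not the authors' implementation; the function name, feature shapes, and the use of plain NumPy are assumptions for this sketch), that idea can be expressed as scaled dot-product cross-attention over flattened feature maps, where person positions attend to every cloth position:

```python
import numpy as np

def cross_attention(person_feats, cloth_feats):
    """Hypothetical sketch of a matching step: person_feats (N, d) act as
    queries; cloth_feats (M, d) act as keys and values, so every person
    position can aggregate information from all cloth positions."""
    d = person_feats.shape[-1]
    # (N, M) long-range correlation scores between all position pairs
    scores = person_feats @ cloth_feats.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over cloth positions
    # Each output row is a correlation-weighted mixture of cloth features
    return weights @ cloth_feats                    # (N, d)

rng = np.random.default_rng(0)
person = rng.standard_normal((16, 8))  # e.g. a 4x4 person feature map, flattened
cloth = rng.standard_normal((16, 8))   # in-shop cloth feature map, flattened
out = cross_attention(person, cloth)
print(out.shape)  # (16, 8)
```

Because the attention weights form a convex combination, every output feature lies within the range of the cloth features it attends to; the point of the sketch is only that each person position sees all cloth positions at once, which a small CNN receptive field cannot do.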
first_indexed 2024-04-23T08:26:14Z
format Journal article
id oxford-uuid:b9373c00-12df-4a9f-86e3-e2618c9ec726
institution University of Oxford
language English
last_indexed 2024-09-25T04:22:34Z
publishDate 2023
publisher Association for Computing Machinery
record_format dspace
spelling oxford-uuid:b9373c00-12df-4a9f-86e3-e2618c9ec7262024-08-12T09:34:45ZCloth interactive transformer for virtual try-onJournal articlehttp://purl.org/coar/resource_type/c_dcae04bcuuid:b9373c00-12df-4a9f-86e3-e2618c9ec726EnglishSymplectic ElementsAssociation for Computing Machinery2023Ren, BTang, HMeng, FDing, RTorr, PSebe, N2D image-based virtual try-on has attracted increasing interest from the multimedia and computer vision communities due to its enormous commercial value. Nevertheless, most existing image-based virtual try-on approaches directly combine the person-identity representation and the in-shop clothing items without considering their mutual correlations. Moreover, these methods are commonly built on pure convolutional neural network (CNN) architectures, which struggle to capture long-range correlations among input pixels and therefore tend to produce inconsistent results. To alleviate these issues, in this article we propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task. In the first stage, we design a CIT matching block that precisely captures the long-range correlations between the cloth-agnostic person information and the in-shop cloth information, making the warped in-shop clothing items look more natural. In the second stage, we introduce a CIT reasoning block that establishes global mutual interactive dependencies among the person representation, the warped clothing item, and the corresponding warped cloth mask. Based on these mutual dependencies, the final try-on results are more realistic. Extensive experiments on a public fashion dataset show that the proposed CIT achieves competitive virtual try-on performance.
spellingShingle Ren, B
Tang, H
Meng, F
Ding, R
Torr, P
Sebe, N
Cloth interactive transformer for virtual try-on
title Cloth interactive transformer for virtual try-on
title_full Cloth interactive transformer for virtual try-on
title_fullStr Cloth interactive transformer for virtual try-on
title_full_unstemmed Cloth interactive transformer for virtual try-on
title_short Cloth interactive transformer for virtual try-on
title_sort cloth interactive transformer for virtual try on
work_keys_str_mv AT renb clothinteractivetransformerforvirtualtryon
AT tangh clothinteractivetransformerforvirtualtryon
AT mengf clothinteractivetransformerforvirtualtryon
AT dingr clothinteractivetransformerforvirtualtryon
AT torrp clothinteractivetransformerforvirtualtryon
AT seben clothinteractivetransformerforvirtualtryon