GaussCtrl: multi-view consistent text-driven 3D Gaussian splatting editing

We propose GaussCtrl, a text-driven method for editing a 3D scene reconstructed with 3D Gaussian Splatting (3DGS). Our method first renders a collection of images from the 3DGS model and edits them with a pre-trained 2D diffusion model (ControlNet) according to the input prompt; the edited images are then used to op...

Bibliographic details
Main authors: Wu, J, Bian, J-W, Li, X, Wang, G, Reid, I, Torr, P, Prisacariu, VA
Format: Conference item
Language: English
Published: Springer 2024
collection OXFORD
description We propose GaussCtrl, a text-driven method for editing a 3D scene reconstructed with 3D Gaussian Splatting (3DGS). Our method first renders a collection of images from the 3DGS model and edits them with a pre-trained 2D diffusion model (ControlNet) according to the input prompt; the edited images are then used to optimise the 3D model. Our key contribution is multi-view consistent editing, which edits all images together instead of iteratively editing one image at a time while updating the 3D model, as in previous works. This leads to faster editing as well as higher visual quality. It is achieved by two components: (a) depth-conditioned editing, which enforces geometric consistency across multi-view images by leveraging naturally consistent depth maps; and (b) attention-based latent code alignment, which unifies the appearance of the edited images by conditioning their editing on several reference views through self- and cross-view attention between the images' latent representations. Experiments demonstrate that our method achieves faster editing and better visual results than previous state-of-the-art methods. Project website: https://gaussctrl.active.vision/
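As a rough illustration of component (b) in the description above, the following NumPy sketch shows one way that self-attention and cross-view attention between per-view latent tokens could be combined. It is not the authors' implementation: the function names, the `blend` weight, the averaging over reference views, and the omission of learned query/key/value projections are all assumptions made only to keep the example small.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Plain scaled dot-product attention for (tokens, dim) arrays.
    scale = 1.0 / np.sqrt(q.shape[-1])
    weights = softmax((q @ k.T) * scale, axis=-1)
    return weights @ v


def aligned_attention(latents, ref_ids, blend=0.5):
    """Blend self-attention with attention to reference views.

    latents: float array of shape (n_views, n_tokens, dim).
    ref_ids: indices of the reference views (hypothetical parameter).
    blend:   weight of the self-attention term (hypothetical parameter).
    """
    refs = latents[ref_ids]  # (n_refs, n_tokens, dim)
    out = np.empty_like(latents)
    for i, x in enumerate(latents):
        # Each view attends to its own tokens ...
        self_out = attention(x, x, x)
        # ... and to the tokens of every reference view (averaged here).
        cross_out = np.mean([attention(x, r, r) for r in refs], axis=0)
        out[i] = blend * self_out + (1.0 - blend) * cross_out
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = rng.normal(size=(4, 6, 8))  # 4 views, 6 tokens, 8 channels
    print(aligned_attention(views, ref_ids=[0, 2]).shape)
```

Because every view draws part of its output from the same reference tokens, the per-view features are pulled toward a shared appearance, which is the intuition behind conditioning all edits on a few reference views.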
first_indexed 2024-09-25T04:09:19Z
id oxford-uuid:b0c2cdb6-b48c-4b69-b149-99aa95ae949a
institution University of Oxford
last_indexed 2025-02-19T04:38:20Z