LN3Diff: scalable latent neural fields diffusion for speedy 3D generation

The field of neural rendering has witnessed significant progress with advancements in generative models and differentiable rendering techniques. Though 2D diffusion has achieved success, a unified 3D diffusion pipeline remains unsettled. This paper introduces a novel framework called LN3Diff to address this gap and enable fast, high-quality, and generic conditional 3D generation. Our approach harnesses a 3D-aware architecture and variational autoencoder (VAE) to encode the input image into a structured, compact, and 3D latent space. The latent is decoded by a transformer-based decoder into a high-capacity 3D neural field. Through training a diffusion model on this 3D-aware latent space, our method achieves state-of-the-art performance on ShapeNet for 3D generation and demonstrates superior performance in monocular 3D reconstruction and conditional 3D generation across various datasets. Moreover, it surpasses existing 3D diffusion methods in terms of inference speed, requiring no per-instance optimization. Our proposed LN3Diff presents a significant advancement in 3D generative modeling and holds promise for various applications in 3D vision and graphics tasks.
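To make the pipeline described in the abstract concrete, the sketch below outlines, in plain PyTorch, the kind of two-stage design it describes: a 3D-aware VAE encodes an image into a compact latent, a transformer-based decoder expands that latent into a neural field queried at 3D points, and a diffusion model would then be trained on the latent space so that new latents can be sampled without per-instance optimization. This is a minimal illustrative sketch only; every class name, layer size, and method here is an assumption made for exposition, not the authors' LN3Diff implementation.

```python
# Hypothetical sketch of the two-stage design described in the abstract.
# All module names and dimensions are illustrative assumptions, not LN3Diff's code.
import torch
import torch.nn as nn

class ImageToLatentVAE(nn.Module):
    """Hypothetical 3D-aware VAE encoder: maps an input image to a compact latent."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)

    def forward(self, img):
        h = self.backbone(img)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return z, mu, logvar

class LatentToNeuralField(nn.Module):
    """Hypothetical transformer-based decoder: expands the latent into tokens and
    predicts density + colour for queried 3D points (a stand-in for a neural field)."""
    def __init__(self, latent_dim=256, n_tokens=96):
        super().__init__()
        self.to_tokens = nn.Linear(latent_dim, n_tokens * latent_dim)
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        self.field_mlp = nn.Sequential(
            nn.Linear(latent_dim + 3, 128), nn.SiLU(), nn.Linear(128, 4))

    def forward(self, z, xyz):
        toks = self.to_tokens(z).view(z.shape[0], -1, z.shape[-1])
        toks = self.transformer(toks)                   # (B, n_tokens, latent_dim)
        feat = toks.mean(dim=1, keepdim=True)           # crude global scene feature
        feat = feat.expand(-1, xyz.shape[1], -1)        # broadcast to every query point
        return self.field_mlp(torch.cat([feat, xyz], dim=-1))  # density + RGB

# Stage 2 (conceptually): freeze the VAE and train a denoising diffusion model on z,
# so that sampling a latent and decoding it yields a 3D asset in a single forward pass.
if __name__ == "__main__":
    enc, dec = ImageToLatentVAE(), LatentToNeuralField()
    img = torch.randn(2, 3, 128, 128)   # dummy input images
    xyz = torch.rand(2, 1024, 3)        # query points in the 3D volume
    z, mu, logvar = enc(img)
    sigma_rgb = dec(z, xyz)             # (2, 1024, 4): density + colour per point
    print(z.shape, sigma_rgb.shape)
```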


Bibliographic Details
Main Authors: Lan, Yushi, Hong, Fangzhou, Yang, Shuai, Zhou, Shangchen, Meng, Xuyi, Dai, Bo, Pan, Xingang, Loy, Chen Change
Other Authors: College of Computing and Data Science
Format: Conference Paper
Language: English
Published: 2024
Subjects: Computer and Information Science; Generative model; Reconstruction
Online Access: https://hdl.handle.net/10356/180256
http://arxiv.org/abs/2403.12019v2
Institution: Nanyang Technological University
Conference: 2024 European Conference on Computer Vision (ECCV)
Research Centre: S-Lab
Funding Agency: Ministry of Education (MOE)
Funding: This study is supported under the RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner(s). It is also supported by Singapore MOE AcRF Tier 2 (MOE-T2EP20221-0011).
Version: Submitted/Accepted version
Citation: Lan, Y., Hong, F., Yang, S., Zhou, S., Meng, X., Dai, B., Pan, X. & Loy, C. C. (2024). LN3Diff: scalable latent neural fields diffusion for speedy 3D generation. 2024 European Conference on Computer Vision (ECCV). https://dx.doi.org/10.48550/arXiv.2403.12019
DOI: 10.48550/arXiv.2403.12019
Related Resource: doi:10.21979/N9/UZ06ZG
Rights: © 2024 ECCV. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder.