LN3Diff: scalable latent neural fields diffusion for speedy 3D generation

The field of neural rendering has witnessed significant progress with advancements in generative models and differentiable rendering techniques. Though 2D diffusion has achieved success, a unified 3D diffusion pipeline remains unsettled. This paper introduces a novel framework called LN3Diff to address this gap and enable fast, high-quality, and generic conditional 3D generation. Our approach harnesses a 3D-aware architecture and variational autoencoder (VAE) to encode the input image into a structured, compact, and 3D latent space. The latent is decoded by a transformer-based decoder into a high-capacity 3D neural field. Through training a diffusion model on this 3D-aware latent space, our method achieves state-of-the-art performance on ShapeNet for 3D generation and demonstrates superior performance in monocular 3D reconstruction and conditional 3D generation across various datasets. Moreover, it surpasses existing 3D diffusion methods in terms of inference speed, requiring no per-instance optimization. Our proposed LN3Diff presents a significant advancement in 3D generative modeling and holds promise for various applications in 3D vision and graphics tasks.
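To make the pipeline described in the abstract concrete, the sketch below outlines, in plain PyTorch, the kind of two-stage design it describes: a 3D-aware VAE encodes an image into a compact latent, a transformer-based decoder expands that latent into a neural field queried at 3D points, and a diffusion model would then be trained on the latent space so that new latents can be sampled without per-instance optimization. This is a minimal illustrative sketch only; every class name, layer size, and method here is an assumption made for exposition, not the authors' LN3Diff implementation.

```python
# Hypothetical sketch of the two-stage design described in the abstract.
# All module names and dimensions are illustrative assumptions, not LN3Diff's code.
import torch
import torch.nn as nn

class ImageToLatentVAE(nn.Module):
    """Hypothetical 3D-aware VAE encoder: maps an input image to a compact latent."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)

    def forward(self, img):
        h = self.backbone(img)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return z, mu, logvar

class LatentToNeuralField(nn.Module):
    """Hypothetical transformer-based decoder: expands the latent into tokens and
    predicts density + colour for queried 3D points (a stand-in for a neural field)."""
    def __init__(self, latent_dim=256, n_tokens=96):
        super().__init__()
        self.to_tokens = nn.Linear(latent_dim, n_tokens * latent_dim)
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        self.field_mlp = nn.Sequential(
            nn.Linear(latent_dim + 3, 128), nn.SiLU(), nn.Linear(128, 4))

    def forward(self, z, xyz):
        toks = self.to_tokens(z).view(z.shape[0], -1, z.shape[-1])
        toks = self.transformer(toks)                   # (B, n_tokens, latent_dim)
        feat = toks.mean(dim=1, keepdim=True)           # crude global scene feature
        feat = feat.expand(-1, xyz.shape[1], -1)        # broadcast to every query point
        return self.field_mlp(torch.cat([feat, xyz], dim=-1))  # density + RGB

# Stage 2 (conceptually): freeze the VAE and train a denoising diffusion model on z,
# so that sampling a latent and decoding it yields a 3D asset in a single forward pass.
if __name__ == "__main__":
    enc, dec = ImageToLatentVAE(), LatentToNeuralField()
    img = torch.randn(2, 3, 128, 128)   # dummy input images
    xyz = torch.rand(2, 1024, 3)        # query points in the 3D volume
    z, mu, logvar = enc(img)
    sigma_rgb = dec(z, xyz)             # (2, 1024, 4): density + colour per point
    print(z.shape, sigma_rgb.shape)
```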


Bibliographic Details
Main Authors: Lan, Yushi, Hong, Fangzhou, Yang, Shuai, Zhou, Shangchen, Meng, Xuyi, Dai, Bo, Pan, Xingang, Loy, Chen Change
Other Authors: College of Computing and Data Science
Format: Conference Paper
Language: English
Published: 2024
Subjects: Computer and Information Science; Generative model; Reconstruction
Online Access: https://hdl.handle.net/10356/180256
http://arxiv.org/abs/2403.12019v2
Institution: Nanyang Technological University
Conference: 2024 European Conference on Computer Vision (ECCV)
Research Centre: S-Lab
Funding Agency: Ministry of Education (MOE)
Funding: This study is supported under the RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner(s). It is also supported by Singapore MOE AcRF Tier 2 (MOE-T2EP20221-0011).
Version: Submitted/Accepted version
Citation: Lan, Y., Hong, F., Yang, S., Zhou, S., Meng, X., Dai, B., Pan, X. & Loy, C. C. (2024). LN3Diff: scalable latent neural fields diffusion for speedy 3D generation. 2024 European Conference on Computer Vision (ECCV). https://dx.doi.org/10.48550/arXiv.2403.12019
DOI: 10.48550/arXiv.2403.12019
Related Resource: doi:10.21979/N9/UZ06ZG
Rights: © 2024 ECCV. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder.