LN3Diff: scalable latent neural fields diffusion for speedy 3D generation
Main Authors: | Lan, Yushi; Hong, Fangzhou; Yang, Shuai; Zhou, Shangchen; Meng, Xuyi; Dai, Bo; Pan, Xingang; Loy, Chen Change
---|---
Other Authors: | College of Computing and Data Science
Format: | Conference Paper
Language: | English
Published: | 2024
Subjects: | Computer and Information Science; Generative model; Reconstruction
Online Access: | https://hdl.handle.net/10356/180256 http://arxiv.org/abs/2403.12019v2
_version_ | 1826120620031606784 |
---|---|
author | Lan, Yushi; Hong, Fangzhou; Yang, Shuai; Zhou, Shangchen; Meng, Xuyi; Dai, Bo; Pan, Xingang; Loy, Chen Change |
author2 | College of Computing and Data Science |
author_sort | Lan, Yushi |
collection | NTU |
description | The field of neural rendering has witnessed significant progress with
advancements in generative models and differentiable rendering techniques.
Though 2D diffusion has achieved success, a unified 3D diffusion pipeline
remains unsettled. This paper introduces a novel framework called LN3Diff to
address this gap and enable fast, high-quality, and generic conditional 3D
generation. Our approach harnesses a 3D-aware architecture and variational
autoencoder (VAE) to encode the input image into a structured, compact, and 3D
latent space. The latent is decoded by a transformer-based decoder into a
high-capacity 3D neural field. Through training a diffusion model on this
3D-aware latent space, our method achieves state-of-the-art performance on
ShapeNet for 3D generation and demonstrates superior performance in monocular
3D reconstruction and conditional 3D generation across various datasets.
Moreover, it surpasses existing 3D diffusion methods in terms of inference
speed, requiring no per-instance optimization. Our proposed LN3Diff presents a
significant advancement in 3D generative modeling and holds promise for various
applications in 3D vision and graphics tasks. |
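The description above outlines a three-stage design: a 3D-aware VAE encodes an input image into a compact, structured latent; a transformer-based decoder expands that latent into a high-capacity 3D neural field; and a diffusion model is then trained directly on the latent space, so sampling needs no per-instance optimization. The sketch below is a minimal, illustrative PyTorch rendition of such a latent neural-field diffusion pipeline, not the authors' released implementation; every module name, layer size, the pooled "field head" stand-in for rendering, and the cosine noise schedule are assumptions made for illustration only.

```python
import math
import torch
import torch.nn as nn


class ImageToLatentVAE(nn.Module):
    """Toy image encoder: maps an RGB view to a compact latent (mu, logvar)."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)

    def forward(self, img):
        h = self.backbone(img)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return z, mu, logvar


class LatentToFieldDecoder(nn.Module):
    """Toy transformer decoder: expands the latent into field tokens and predicts
    RGB + density from their pooled features (a stand-in for a real tri-plane or
    NeRF-style renderer queried at 3D points)."""
    def __init__(self, latent_dim=256, n_tokens=96, token_dim=256):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(n_tokens, token_dim))
        self.proj = nn.Linear(latent_dim, token_dim)
        layer = nn.TransformerEncoderLayer(token_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        self.field_head = nn.Linear(token_dim, 4)  # RGB + density (toy output)

    def forward(self, z):
        tok = self.tokens.unsqueeze(0).expand(z.shape[0], -1, -1) + self.proj(z).unsqueeze(1)
        feats = self.transformer(tok)               # (B, n_tokens, token_dim)
        return self.field_head(feats.mean(dim=1))   # (B, 4), placeholder for rendering


class LatentDenoiser(nn.Module):
    """Toy epsilon-prediction network standing in for the latent diffusion model."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, 512), nn.SiLU(), nn.Linear(512, latent_dim))

    def forward(self, z_noisy, t):
        return self.net(torch.cat([z_noisy, t[:, None].float()], dim=-1))


# One illustrative training step for the latent diffusion stage: encode an image,
# corrupt its latent at a random timestep, and regress the added noise.
enc, dec, eps_model = ImageToLatentVAE(), LatentToFieldDecoder(), LatentDenoiser()
img = torch.randn(2, 3, 64, 64)                     # dummy batch of input views
z, mu, logvar = enc(img)
t = torch.randint(0, 1000, (2,))
alpha = torch.cos(t.float() / 1000 * math.pi / 2)[:, None]  # simple cosine schedule
noise = torch.randn_like(z)
z_noisy = alpha * z + (1 - alpha ** 2).sqrt() * noise
loss = ((eps_model(z_noisy, t) - noise) ** 2).mean()
loss.backward()

rgb_sigma = dec(mu)  # decode the clean latent into the toy "neural field" output
```

In a latent-diffusion setup of this kind, the encoder and decoder would typically be trained first (with reconstruction and KL objectives) and then frozen while the denoiser is trained on their latents; at sampling time a latent drawn from the diffusion model is decoded into the neural field and rendered, with no per-instance optimization, which is the source of the inference-speed advantage claimed in the abstract.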
first_indexed | 2024-10-01T05:19:24Z |
format | Conference Paper |
id | ntu-10356/180256 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2025-03-09T12:45:33Z |
publishDate | 2024 |
record_format | dspace |
spelling | ntu-10356/180256 (2024-10-01T05:53:14Z)
LN3Diff: scalable latent neural fields diffusion for speedy 3D generation
Authors: Lan, Yushi; Hong, Fangzhou; Yang, Shuai; Zhou, Shangchen; Meng, Xuyi; Dai, Bo; Pan, Xingang; Loy, Chen Change
Affiliations: College of Computing and Data Science; S-Lab
Conference: 2024 European Conference on Computer Vision (ECCV)
Subjects: Computer and Information Science; Generative model; Reconstruction
Abstract: as given in the description field above.
Funding: Ministry of Education (MOE). This study is supported under the RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner(s). It is also supported by Singapore MOE AcRF Tier 2 (MOE-T2EP20221-0011).
Version: Submitted/Accepted version
Dates: 2024-09-26T05:10:50Z; 2024-09-26T05:10:50Z; 2024
Format: Conference Paper
Citation: Lan, Y., Hong, F., Yang, S., Zhou, S., Meng, X., Dai, B., Pan, X. & Loy, C. C. (2024). LN3Diff: scalable latent neural fields diffusion for speedy 3D generation. 2024 European Conference on Computer Vision (ECCV). https://dx.doi.org/10.48550/arXiv.2403.12019
Identifiers: https://hdl.handle.net/10356/180256; 10.48550/arXiv.2403.12019; http://arxiv.org/abs/2403.12019v2
Language: en
Related data: doi:10.21979/N9/UZ06ZG
Rights: © 2024 ECCV. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder.
File format: application/pdf |
title | LN3Diff: scalable latent neural fields diffusion for speedy 3D generation |
topic | Computer and Information Science; Generative model; Reconstruction |
url | https://hdl.handle.net/10356/180256 http://arxiv.org/abs/2403.12019v2 |