4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields

We present an efficient approach for monocular 4D facial avatar reconstruction using a dynamic neural radiance field (NeRF). Over the years, NeRFs have been popular methods for 3D scene representation, but lack computational efficiency and controllability, making them impractical for real-world applic...

Bibliographic Details
Main Authors: Jeong-Gi Kwak, Hanseok Ko
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Neural radiance field (NeRF); monocular facial avatar reconstruction; face reenactment
Online Access: https://ieeexplore.ieee.org/document/10401911/
_version_ 1797335520962936832
author Jeong-Gi Kwak
Hanseok Ko
author_facet Jeong-Gi Kwak
Hanseok Ko
author_sort Jeong-Gi Kwak
collection DOAJ
description We present an efficient approach for monocular 4D facial avatar reconstruction using a dynamic neural radiance field (NeRF). Over the years, NeRFs have been popular methods for 3D scene representation, but they lack computational efficiency and controllability, making them impractical for real-world applications such as AR/VR, teleconferencing, and immersive experiences. The recent introduction of grid-based encoding by InstantNGP has made NeRF rendering much faster, but it is limited to static 3D scenes. To address these issues, we focus on developing a novel dynamic NeRF that allows explicit control over pose and facial expression while retaining computational efficiency. By leveraging a low-dimensional basis from the 3D morphable model (3DMM) with a carefully designed spatial encoding branch and ambient encoding branch, we condition a dynamic radiance field in an ambient space, improving controllability and visual quality. Our model is approximately 30x faster at training and 100x faster at inference than the baseline (NeRFace), making it practical for real-world applications. Through qualitative and quantitative experiments, we demonstrate the effectiveness of our approach. The dynamic NeRF exhibits superior controllability, enhanced 3D consistency, and improved visual quality. Our efficient model opens new possibilities for real-time applications in AR/VR and teleconferencing.
first_indexed 2024-03-08T08:39:28Z
format Article
id doaj.art-99420d863c17409b81463f156530fc77
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-08T08:39:28Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-99420d863c17409b81463f156530fc77
2024-02-02T00:03:17Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
12
15675
15683
10.1109/ACCESS.2024.3355052
10401911
4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
Jeong-Gi Kwak 0
Hanseok Ko 1
https://orcid.org/0000-0002-8744-4514
School of Electrical Engineering, Korea University, Seoul, South Korea
School of Electrical Engineering, Korea University, Seoul, South Korea
https://ieeexplore.ieee.org/document/10401911/
Neural radiance field (NeRF)
monocular facial avatar reconstruction
face reenactment
spellingShingle Jeong-Gi Kwak
Hanseok Ko
4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
IEEE Access
Neural radiance field (NeRF)
monocular facial avatar reconstruction
face reenactment
title 4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
title_full 4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
title_fullStr 4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
title_full_unstemmed 4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
title_short 4D Facial Avatar Reconstruction From Monocular Video via Efficient and Controllable Neural Radiance Fields
title_sort 4d facial avatar reconstruction from monocular video via efficient and controllable neural radiance fields
topic Neural radiance field (NeRF)
monocular facial avatar reconstruction
face reenactment
url https://ieeexplore.ieee.org/document/10401911/
work_keys_str_mv AT jeonggikwak 4dfacialavatarreconstructionfrommonocularvideoviaefficientandcontrollableneuralradiancefields
AT hanseokko 4dfacialavatarreconstructionfrommonocularvideoviaefficientandcontrollableneuralradiancefields