Everybody's talkin': let me talk as you want

We present a method to edit target portrait footage by taking a sequence of audio as input to synthesize a photo-realistic video. This method is unique because it is highly dynamic. It does not assume a person-specific rendering network, yet it is capable of translating one source audio into one random c...

Bibliographic Details
Main Authors: Song, Linsen; Wu, Wayne; Qian, Chen; He, Ran; Loy, Chen Change
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2022
Subjects:
Online Access: https://hdl.handle.net/10356/162986