Summary: | This research project addresses virtual humans: it aims to enable natural language control of 3D avatars so that they perform human-like movements coherent with their surrounding environment. To achieve this, the project proposes to learn a "conditional" human motion prior that incorporates scene information and/or language descriptions, yielding motions that are more coherent, meaningful, and better suited to a given scenario and objective. Potential applications include gaming, virtual avatars, and the Metaverse. Concretely, the project employs a two-stage approach that combines language-conditioned human motion generation with physics-based character control to produce diverse, physically plausible human motions in a physical world from language descriptions. The results demonstrate the effectiveness of this approach in terms of motion diversity, faithfulness to the language description, and physical plausibility.
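The two-stage approach can be sketched as follows. This is a minimal, hypothetical illustration, not the project's actual method: stage 1 stands in for a language-conditioned motion generator (here a toy text-keyed trajectory), and stage 2 stands in for physics-based character control (here a simple PD tracker that follows the kinematic reference, which is what enforces physical plausibility in the real pipeline). All function names and parameters are illustrative assumptions.

```python
# Hypothetical sketch of the two-stage pipeline described above.
# Names (generate_kinematic_motion, track_with_pd) are illustrative,
# not the project's actual API.

def generate_kinematic_motion(text: str, n_frames: int = 10) -> list[float]:
    """Stage 1 (stand-in): map a language description to a kinematic
    reference trajectory. Here: a toy 1-D joint-angle ramp whose speed
    depends on a keyword in the text."""
    speed = 2.0 if "run" in text else 1.0
    return [0.1 * speed * t for t in range(n_frames)]

def track_with_pd(reference: list[float], kp: float = 0.8,
                  kd: float = 0.2, dt: float = 1.0) -> list[float]:
    """Stage 2 (stand-in): a physics-based controller tracks the
    kinematic reference. A PD law computes a force from position error
    and velocity, then integrates simple point-mass dynamics."""
    pos, vel = 0.0, 0.0
    simulated = []
    for target in reference:
        force = kp * (target - pos) - kd * vel  # PD control law
        vel += force * dt                       # integrate acceleration
        pos += vel * dt                         # integrate velocity
        simulated.append(pos)
    return simulated

reference = generate_kinematic_motion("a person runs forward")
motion = track_with_pd(reference)
```

The key design point this illustrates is the division of labor: the first stage only has to get the *semantics* of the motion right, while the second stage, being driven by simulated dynamics rather than direct pose copying, keeps the final motion physically consistent.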