Natural-Language-Driven Multimodal Representation Learning for Audio-Visual Scene-Aware Dialog System

With the development of multimedia systems in wireless environments, there is a growing need for artificial intelligence systems that can communicate with humans in a human-like manner while comprehensively understanding various types of information. Therefore, this paper addresses...

Bibliographic Details
Main Authors: Yoonseok Heo, Sangwoo Kang, Jungyun Seo
Format: Article
Language: English
Published: MDPI AG 2023-09-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/23/18/7875