Attention-Based Fusion of Ultrashort Voice Utterances and Depth Videos for Multimodal Person Identification

Multimodal deep learning, in the context of biometrics, encounters significant challenges due to the dependence on long speech utterances and RGB images, which are often impractical in certain situations. This paper presents a novel solution addressing these issues by leveraging ultrashort voice utt...

Bibliographic Details
Main Authors:	Abderrazzaq Moufidi, David Rousseau, Pejman Rasti
Format:	Article
Language:	English
Published:	MDPI AG 2023-06-01
Series:	Sensors
Subjects:	depth images; lip identification; speaker identification; late fusion; multimodality; spatiotemporal
Online Access:	https://www.mdpi.com/1424-8220/23/13/5890

Similar Items