Aligning source visual and target language domains for unpaired video captioning

Training a supervised video captioning model requires coupled video-caption pairs. However, for many target languages, sufficient paired data are not available. To this end, we introduce the unpaired video captioning task, which aims to train models without coupled video-caption pairs in the target language....

Bibliographic Details
Main Authors: Liu, F; Wu, X; You, C; Ge, S; Zou, Y; Sun, X
Format: Journal article
Language: English
Published: IEEE 2022