Aligning source visual and target language domains for unpaired video captioning

Training a supervised video captioning model requires coupled video-caption pairs. However, for many target languages, sufficient paired data are not available. To this end, we introduce the unpaired video captioning task, which aims to train models without coupled video-caption pairs in the target language....

Bibliographic details
Main authors:	Liu, F, Wu, X, You, C, Ge, S, Zou, Y, Sun, X
Format:	Journal article
Language:	English
Published:	IEEE 2022
