Aligning source visual and target language domains for unpaired video captioning

Training a supervised video captioning model requires coupled video-caption pairs. However, for many target languages, sufficient paired data are not available. To this end, we introduce the unpaired video captioning task, which aims to train models without coupled video-caption pairs in the target language....

Bibliographic details
Main authors: Liu, F, Wu, X, You, C, Ge, S, Zou, Y, Sun, X
Format: Journal article
Language: English
Published in: IEEE 2022