Text-Free Audio Captions of Short Videos from Latent Space Representation

In this thesis, we re-implement previous work on image-to-speech captioning and extend it to video-to-speech captioning. Specifically, we implement a text-free image-to-speech captioning pipeline that integrates four distinct machine learning models. We alter the models t...

Bibliographic Details
Main Author:	Agarwal, Anisha
Other Authors:	Oliva, Aude
Format:	Thesis
Published:	Massachusetts Institute of Technology 2022
Online Access:	https://hdl.handle.net/1721.1/144873
