Learning Aligned Cross-Modal Representations from Weakly Aligned Data

People can recognize scenes across many different modalities beyond natural images. In this paper, we investigate how to learn cross-modal scene representations that transfer across modalities. To study this problem, we introduce a new cross-modal scene dataset. While convolutional neural networks c...

Bibliographic details
Main Authors: Castrejon, Lluis; Pirsiavash, Hamed; Aytar, Yusuf; Vondrick, Carl Martin; Torralba, Antonio
Other authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Language: en_US
Published: Institute of Electrical and Electronics Engineers (IEEE) 2017
Online access: http://hdl.handle.net/1721.1/112989
https://orcid.org/0000-0003-1631-4525
https://orcid.org/0000-0001-5676-2387
https://orcid.org/0000-0003-4915-0256