Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated according to this distribution. The model consists o...

Bibliographic Details
Main Authors:	Mao, Junhua; Xu, Wei; Yang, Yi; Wang, Jiang; Huang, Zhiheng; Yuille, Alan L.
Format:	Technical Report
Language:	en_US
Published:	Center for Brains, Minds and Machines (CBMM), arXiv 2015
Subjects:	multimodal Recurrent Neural Network (m-RNN); Artificial Intelligence; Computer Language
Online Access:	http://hdl.handle.net/1721.1/100198
