Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated according to this distribution. The model consists o...
Main Authors: | , , , , , |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: |
Center for Brains, Minds and Machines (CBMM), arXiv
2015
|
Subjects: | |
Online Access: | http://hdl.handle.net/1721.1/100198 |