COMIC: Toward A Compact Image Captioning Model With Attention

Bibliographic Details
Main Authors:	Tan, Jia Huei; Chan, Chee Seng; Chuah, Joon Huang
Format:	Article
Published:	Institute of Electrical and Electronics Engineers (IEEE) 2019
Subjects:	QA75 Electronic computers. Computer science; TK Electrical engineering. Electronics Nuclear engineering

_version_	1825722035562610688
author	Tan, Jia Huei; Chan, Chee Seng; Chuah, Joon Huang
author_facet	Tan, Jia Huei; Chan, Chee Seng; Chuah, Joon Huang
author_sort	Tan, Jia Huei
collection	UM
description	Recent works in image captioning have shown very promising raw performance. However, we realize that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary size, making them difficult to deploy on embedded systems with limited hardware resources. This is because the size of word and output embedding matrices grow proportionally with the size of vocabulary, adversely affecting the compactness of these networks. To address this limitation, this paper introduces a brand new idea in the domain of image captioning. That is, we tackle the problem of compactness of image captioning models which is hitherto unexplored. We showed that our proposed model, named COMIC for compact image captioning, achieves comparable results in five common evaluation metrics with state-of-the-art approaches on both MS-COCO and InstaPIC-1.1M datasets despite having an embedded vocabulary size that is 39×-99× smaller. © 1999-2012 IEEE.
first_indexed	2024-03-06T05:59:29Z
format	Article
id	um.eprints-23306
institution	Universiti Malaya
last_indexed	2024-03-06T05:59:29Z
publishDate	2019
publisher	Institute of Electrical and Electronics Engineers (IEEE)
record_format	dspace
spelling	um.eprints-233062020-01-06T01:50:56Z http://eprints.um.edu.my/23306/ COMIC: Toward A Compact Image Captioning Model With Attention Tan, Jia Huei Chan, Chee Seng Chuah, Joon Huang QA75 Electronic computers. Computer science TK Electrical engineering. Electronics Nuclear engineering Recent works in image captioning have shown very promising raw performance. However, we realize that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary size, making them difficult to deploy on embedded systems with limited hardware resources. This is because the size of word and output embedding matrices grow proportionally with the size of vocabulary, adversely affecting the compactness of these networks. To address this limitation, this paper introduces a brand new idea in the domain of image captioning. That is, we tackle the problem of compactness of image captioning models which is hitherto unexplored. We showed that our proposed model, named COMIC for compact image captioning, achieves comparable results in five common evaluation metrics with state-of-the-art approaches on both MS-COCO and InstaPIC-1.1M datasets despite having an embedded vocabulary size that is 39×-99× smaller. © 1999-2012 IEEE. Institute of Electrical and Electronics Engineers (IEEE) 2019 Article PeerReviewed Tan, Jia Huei and Chan, Chee Seng and Chuah, Joon Huang (2019) COMIC: Toward A Compact Image Captioning Model With Attention. IEEE Transactions on Multimedia, 21 (10). pp. 2686-2696. ISSN 1520-9210, DOI https://doi.org/10.1109/TMM.2019.2904878 <https://doi.org/10.1109/TMM.2019.2904878>. https://doi.org/10.1109/TMM.2019.2904878 doi:10.1109/TMM.2019.2904878
spellingShingle	QA75 Electronic computers. Computer science TK Electrical engineering. Electronics Nuclear engineering Tan, Jia Huei Chan, Chee Seng Chuah, Joon Huang COMIC: Toward A Compact Image Captioning Model With Attention
title	COMIC: Toward A Compact Image Captioning Model With Attention
title_sort	comic toward a compact image captioning model with attention
topic	QA75 Electronic computers. Computer science; TK Electrical engineering. Electronics Nuclear engineering
work_keys_str_mv	AT tanjiahuei comictowardacompactimagecaptioningmodelwithattention AT chancheeseng comictowardacompactimagecaptioningmodelwithattention AT chuahjoonhuang comictowardacompactimagecaptioningmodelwithattention

Similar Items