Deep Gated Recurrent Unit for Smartphone-Based Image Captioning

Expressing the visual content of an image in natural language form has gained relevance due to technological and algorithmic advances together with improved computational processing capacity. Many smartphone applications for image captioning have been developed recently as built-in cameras provide a...

Bibliographic Details
Main Author: Volkan Kılıç
Format: Article
Language: English
Published: Sakarya University 2021-08-01
Series: Sakarya University Journal of Computer and Information Sciences
Subjects: artificial intelligence; natural language processing; image captioning; android
Online Access: https://dergipark.org.tr/tr/download/article-file/1526648
author Volkan Kılıç
collection DOAJ
description Expressing the visual content of an image in natural language form has gained relevance due to technological and algorithmic advances together with improved computational processing capacity. Many smartphone applications for image captioning have been developed recently as built-in cameras provide advantages of easy-operation and portability, resulting in capturing an image whenever or wherever needed. Here, an encoder-decoder framework based new image captioning approach with a multi-layer gated recurrent unit is proposed. The Inception-v3 convolutional neural network is employed in the encoder due to its capability of more feature extraction from small regions. The proposed recurrent neural network-based decoder utilizes these features in the multi-layer gated recurrent unit to produce a natural language expression word-by-word. Experimental evaluations on the MSCOCO dataset demonstrate that our proposed approach has the advantage over existing approaches consistently across different evaluation metrics. With the integration of the proposed approach to our custom-designed Android application, named “VirtualEye+”, it has great potential to implement image captioning in daily routine.
first_indexed 2024-03-08T13:06:45Z
format Article
id doaj.art-fa82798f98b649529132c36eec6183a8
institution Directory Open Access Journal
issn 2636-8129
language English
last_indexed 2024-03-08T13:06:45Z
publishDate 2021-08-01
publisher Sakarya University
record_format Article
series Sakarya University Journal of Computer and Information Sciences
spelling doaj.art-fa82798f98b649529132c36eec6183a8; 2024-01-18T16:44:37Z; eng; Sakarya University; Sakarya University Journal of Computer and Information Sciences; 2636-8129; 2021-08-01; 4218119110.35377/saucis.04.02.86640928; Deep Gated Recurrent Unit for Smartphone-Based Image Captioning; Volkan Kılıç; IZMIR KATIP CELEBI UNIVERSITY; https://dergipark.org.tr/tr/download/article-file/1526648; artificial intelligence; natural language processing; image captioning; android
title Deep Gated Recurrent Unit for Smartphone-Based Image Captioning
topic artificial intelligence
natural language processing
image captioning
android
url https://dergipark.org.tr/tr/download/article-file/1526648
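
The decoder described in the abstract, a multi-layer gated recurrent unit that turns encoder features into a caption word by word, can be sketched roughly as follows. This is a minimal pure-Python toy under stated assumptions, not the author's implementation: the weights are random and untrained, the image feature is a random stand-in for Inception-v3 output and is assumed to already have the hidden-state size, every layer's state is assumed to start from that feature, decoding is greedy, and all names (`init_gru`, `gru_step`, `greedy_caption`) are hypothetical.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_gru(rng, in_dim, hid_dim):
    # One (W, U, b) triple per gate: update "z", reset "r", candidate "h".
    return {g: (rand_matrix(rng, hid_dim, in_dim),
                rand_matrix(rng, hid_dim, hid_dim),
                [0.0] * hid_dim) for g in "zrh"}


def gru_step(p, x, h):
    """One GRU update; returns the new hidden state."""
    def pre(g, hv):
        W, U, b = p[g]
        return [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, hv), b)]

    z = [sigmoid(a) for a in pre("z", h)]
    r = [sigmoid(a) for a in pre("r", h)]
    # The reset gate scales the previous state before the candidate is formed.
    cand = [math.tanh(a) for a in pre("h", [ri * hi for ri, hi in zip(r, h)])]
    # Convention used here: z near 1 favours the candidate state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def greedy_caption(feature, layers, embed, out_w, vocab, max_len=10):
    # Assumption: each layer's hidden state is initialised from the image feature.
    states = [list(feature) for _ in layers]
    token, words = vocab.index("<start>"), []
    for _ in range(max_len):
        x = embed[token]
        for i, p in enumerate(layers):
            # The output of one layer is the input of the next.
            states[i] = gru_step(p, x, states[i])
            x = states[i]
        scores = matvec(out_w, x)
        token = max(range(len(vocab)), key=scores.__getitem__)
        if vocab[token] == "<end>":
            break
        words.append(vocab[token])
    return words


rng = random.Random(0)
vocab = ["<start>", "<end>", "a", "dog", "on", "grass"]
emb_dim, hid_dim = 4, 8
layers = [init_gru(rng, emb_dim, hid_dim), init_gru(rng, hid_dim, hid_dim)]
embed = rand_matrix(rng, len(vocab), emb_dim)
out_w = rand_matrix(rng, len(vocab), hid_dim)
feature = [rng.uniform(-1, 1) for _ in range(hid_dim)]  # stand-in for CNN features
caption = greedy_caption(feature, layers, embed, out_w, vocab)
print(caption)
```

Because the weights are untrained, the printed words are arbitrary. The sketch only shows how each layer's output feeds the next layer and how the top layer's state is scored against the vocabulary at every step.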