Text-based image retrieval using image captioning

Bibliographic Details
Main Author: Tan, Kah Hwa
Other Authors: Yap Kim Hui
Format: Final Year Project (FYP)
Language: English
Published: 2019
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: http://hdl.handle.net/10356/78003
_version_ 1826127753194242048
author Tan, Kah Hwa
author2 Yap Kim Hui
author_facet Yap Kim Hui
Tan, Kah Hwa
author_sort Tan, Kah Hwa
collection NTU
description Taking photos with smartphones has become habitual for the new generation. Newer smartphone models have such large storage capacity that users no longer need to organize their photo galleries. A smartphone user can easily accumulate a thousand or more photos over the years, which makes finding a particular photo a daunting task. This project aimed to develop a text-based visual search for images using image captioning. A thorough literature review was conducted to understand the latest image captioning techniques. Several techniques were compared before selecting the most reasonable and feasible one. The model was trained and evaluated on well-known metrics to demonstrate its feasibility. Next, a web-based photo gallery application was created with Django, using the trained image captioning model as the backbone to provide the visual retrieval capability. Retrieval through the web application was developed with a robust search function to handle human errors. The web application is also integrated with different recognition models to improve the relevance of the retrieved images. The report contains the experimental results, the steps to develop the web application, how the graphical user interface (GUI) of the web application is integrated with the image captioning model, the difficulties encountered while performing the tasks, and a comparison of different search functions for retrieving relevant images. It concludes by discussing the potential of image captioning for visual retrieval tasks, the current limitations, and possible future work.
first_indexed 2024-10-01T07:13:40Z
format Final Year Project (FYP)
id ntu-10356/78003
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:13:40Z
publishDate 2019
record_format dspace
spelling ntu-10356/78003 2023-07-07T16:31:00Z Text-based image retrieval using image captioning Tan, Kah Hwa Yap Kim Hui School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering
Bachelor of Engineering (Electrical and Electronic Engineering) 2019-06-11T02:02:55Z 2019-06-11T02:02:55Z 2019 Final Year Project (FYP) http://hdl.handle.net/10356/78003 en Nanyang Technological University 68 p. application/pdf
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Tan, Kah Hwa
Text-based image retrieval using image captioning
title Text-based image retrieval using image captioning
title_full Text-based image retrieval using image captioning
title_fullStr Text-based image retrieval using image captioning
title_full_unstemmed Text-based image retrieval using image captioning
title_short Text-based image retrieval using image captioning
title_sort text based image retrieval using image captioning
topic DRNTU::Engineering::Electrical and electronic engineering
url http://hdl.handle.net/10356/78003
work_keys_str_mv AT tankahhwa textbasedimageretrievalusingimagecaptioning