Large scale datasets for Image and Video Captioning in Italian

The application of attention-based deep neural architectures to the automatic captioning of images and videos is enabling the development of increasingly capable systems. Unfortunately, while image processing is language-independent, this does not hold for caption generation. Training such archit...

Bibliographic Details
Main Authors:	Scaiella Antonio, Danilo Croce, Roberto Basili
Format:	Article
Language:	English
Published:	Accademia University Press 2019-12-01
Series:	IJCoL
Online Access:	http://journals.openedition.org/ijcol/478

_version_	1819202257283973120
author	Scaiella Antonio Danilo Croce Roberto Basili
collection	DOAJ
description	The application of attention-based deep neural architectures to the automatic captioning of images and videos is enabling the development of increasingly capable systems. Unfortunately, while image processing is language-independent, this does not hold for caption generation. Training such architectures requires (possibly large-scale) language-specific resources, which are not available for many languages, such as Italian. In this paper, we present MSCOCO-it and MSR-VTT-it, two large-scale resources for image and video captioning. They have been derived by applying automatic machine translation to existing resources. Even though this approach is naive and exposed to noise (depending on the quality of the automatic translator), we show experimentally that it enables robust deep learning that is rather tolerant of such noise. In particular, we improve the state-of-the-art results for image captioning in Italian. Moreover, we discuss the training of a system that, to the best of our knowledge, is the first video captioning system in Italian.
first_indexed	2024-12-23T04:01:09Z
format	Article
id	doaj.art-5273e36a79404a07978ed1fcf57fc24a
institution	Directory Open Access Journal
issn	2499-4553
language	English
last_indexed	2024-12-23T04:01:09Z
publishDate	2019-12-01
publisher	Accademia University Press
record_format	Article
series	IJCoL
spelling	doaj.art-5273e36a79404a07978ed1fcf57fc24a | 2022-12-21T18:00:44Z | eng | Accademia University Press | IJCoL | 2499-4553 | 2019-12-01 | 5 | 2 | 49 | 60 | 10.4000/ijcol.478 | Large scale datasets for Image and Video Captioning in Italian | Scaiella Antonio | Danilo Croce | Roberto Basili | http://journals.openedition.org/ijcol/478
title	Large scale datasets for Image and Video Captioning in Italian
url	http://journals.openedition.org/ijcol/478

Similar Items