Large scale datasets for Image and Video Captioning in Italian
The application of Attention-based Deep Neural architectures to the automatic captioning of images and videos is enabling the development of increasingly performing systems. Unfortunately, while image processing is language independent, this does not hold for caption generation. Training such archit...
Main Authors: | Scaiella Antonio, Danilo Croce, Roberto Basili |
---|---|
Format: | Article |
Language: | English |
Published: | Accademia University Press, 2019-12-01 |
Series: | IJCoL |
Online Access: | http://journals.openedition.org/ijcol/478 |
_version_ | 1819202257283973120 |
---|---|
author | Scaiella Antonio; Danilo Croce; Roberto Basili |
author_facet | Scaiella Antonio; Danilo Croce; Roberto Basili |
author_sort | Scaiella Antonio |
collection | DOAJ |
description | The application of attention-based deep neural architectures to the automatic captioning of images and videos is enabling the development of increasingly effective systems. Unfortunately, while image processing is language independent, caption generation is not. Training such architectures requires (possibly large-scale) language-specific resources, which are not available for many languages, including Italian. In this paper, we present MSCOCO-it and MSR-VTT-it, two large-scale resources for image and video captioning. They have been derived by applying automatic machine translation to existing resources. Even though this approach is naive and prone to introducing noise (depending on the quality of the automatic translator), we experimentally show that it enables robust deep learning that is rather tolerant of such noise. In particular, we improve the state-of-the-art results for image captioning in Italian. Moreover, we discuss the training of a system that, to the best of our knowledge, is the first video captioning system in Italian. |
first_indexed | 2024-12-23T04:01:09Z |
format | Article |
id | doaj.art-5273e36a79404a07978ed1fcf57fc24a |
institution | Directory Open Access Journal |
issn | 2499-4553 |
language | English |
last_indexed | 2024-12-23T04:01:09Z |
publishDate | 2019-12-01 |
publisher | Accademia University Press |
record_format | Article |
series | IJCoL |
spelling | doaj.art-5273e36a79404a07978ed1fcf57fc24a; 2022-12-21T18:00:44Z; eng; Accademia University Press; IJCoL; 2499-4553; 2019-12-01; 524960; 10.4000/ijcol.478; Large scale datasets for Image and Video Captioning in Italian; Scaiella Antonio; Danilo Croce; Roberto Basili; http://journals.openedition.org/ijcol/478 |
title | Large scale datasets for Image and Video Captioning in Italian |
url | http://journals.openedition.org/ijcol/478 |