Video summarisation by deep visual and categorical diversity

The authors propose a video-summarisation method based on visual and categorical diversities, using pre-trained deep visual and categorical models. Their method extracts visual features from a pre-trained deep convolutional network (DCN) and categorical features from a pre-trained word-embedding matrix. From this visual and categorical information they obtain a video diversity estimation, which is used as an importance score to select the segments of the input video that best describe it. Their method also allows queries to be performed during the search process, thereby personalising the resulting video summaries to particular intended purposes. The performance of the method is evaluated using different pre-trained DCN models in order to select the architecture with the best throughput. The method is then compared with other state-of-the-art video-summarisation proposals using a data-driven approach on the public SumMe dataset, which contains videos annotated with per-fragment importance. The results show that their method outperforms the other proposals in most of the examples. As an additional advantage, their method has a simple and direct implementation that requires no training stage.
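
A rough illustration of the idea the abstract describes: the minimal Python sketch below scores fixed-length segments by the diversity of their pre-trained DCN features and keeps the most diverse ones as the summary. It is a sketch under assumptions, not the authors' implementation: the ResNet-18 backbone, 16-frame segments, cosine distance, and 15% summary budget are illustrative choices, and the `frame_features`, `segment_scores`, and `summarise` helpers are hypothetical names.

```python
# Sketch of diversity-as-importance video summarisation (visual side only).
# Assumes frames are uniformly sampled HxWx3 uint8 arrays and that
# torchvision >= 0.13 is available for the pre-trained backbone.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T

def frame_features(frames, device="cpu"):
    """Embed frames with a pre-trained DCN and L2-normalise the features."""
    net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    net.fc = torch.nn.Identity()              # keep the 512-d pooled features
    net.eval().to(device)
    prep = T.Compose([
        T.ToPILImage(), T.Resize(224), T.CenterCrop(224), T.ToTensor(),
        T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    batch = torch.stack([prep(f) for f in frames]).to(device)
    with torch.no_grad():
        feats = net(batch).cpu().numpy()
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)

def segment_scores(feats, seg_len=16):
    """Score each fixed-length segment by its diversity w.r.t. the video."""
    sims = feats @ feats.T                    # cosine similarity (unit vectors)
    frame_div = 1.0 - sims.mean(axis=1)       # per-frame mean cosine distance
    n_segs = len(feats) // seg_len
    return np.array([frame_div[i * seg_len:(i + 1) * seg_len].mean()
                     for i in range(n_segs)])

def summarise(frames, seg_len=16, budget=0.15):
    """Return indices of the most diverse segments, within a length budget."""
    scores = segment_scores(frame_features(frames), seg_len)
    k = max(1, int(budget * len(scores)))
    return np.sort(np.argsort(scores)[-k:])   # top-k segments, temporal order
```

Reading frames (for example with OpenCV's VideoCapture) and mapping the selected segment indices back to timestamps is left out. The paper's full method additionally folds in categorical diversity, obtained via a pre-trained word-embedding matrix, and supports query-driven personalisation of the summary; both are omitted from this sketch.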

Bibliographic Details
Main Authors: Pedro Atencio (Faculty of Engineering, Instituto Tecnológico Metropolitano, Medellín, Colombia), German Sánchez-Torres (Faculty of Engineering, Universidad del Magdalena, Santa Marta, Colombia), John William Branch (Faculty of Mines, Universidad Nacional de Colombia, Medellín, Colombia), Claudio Delrieux (Electric and Computing Engineering Department, Universidad Nacional del Sur, Bahía Blanca, Argentina)
Format: Article
Language: English
Published: Wiley, 2019-09-01
Series: IET Computer Vision, vol. 13, no. 6, pp. 569-577
ISSN: 1751-9632, 1751-9640
Subjects: video summarisation; deep visual diversities; categorical diversities; video-summarisation method; pre-trained deep visual models; categorical models
Online Access: https://doi.org/10.1049/iet-cvi.2018.5436
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-0e22299bedda42bd8670fee74c1f8a11