Tiny Video Networks

Abstract: Automatic video understanding is becoming more important for applications where real-time performance is crucial and compute is limited: for example, automated video tagging, robot perception, and activity recognition on mobile devices. Yet accurate solutions so far have been computationally intensive. We propose efficient models for videos, Tiny Video Networks (TVNs): video architectures automatically designed to meet fast runtime targets while remaining effective at video recognition tasks. TVNs run at faster-than-real-time speeds and demonstrate strong performance across several video benchmarks. These models not only provide new tools for real-time video applications but also enable fast research and development in video understanding. Code and models are available.

Bibliographic Details
Main Authors: A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo (Google Research, Robotics at Google, Mountain View, California, USA)
Format: Article
Language: English
Published: Wiley, 2022-02-01
Series: Applied AI Letters
ISSN: 2689-5595
Collection: Directory of Open Access Journals (DOAJ)
Subjects: efficient video models; video architecture search; video understanding
Online Access: https://doi.org/10.1002/ail2.38