Turbo training with token dropout
The objective of this paper is an efficient training method for video tasks. We make three contributions: (1) We propose Turbo training, a simple and versatile training paradigm for Transformers on multiple video tasks. (2) We illustrate the advantages of Turbo training on action classification, video-language representation learning, and long-video activity classification, showing that Turbo training can largely maintain competitive performance while achieving almost 4× speed-up and significantly less memory consumption. (3) Turbo training enables long-schedule video-language training and end-to-end long-video training, delivering competitive or superior performance than previous works, which were infeasible to train under limited resources.
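The core idea behind the speed-up, dropping a large fraction of patch tokens before the Transformer, can be illustrated with a minimal sketch. Because self-attention cost grows quadratically with sequence length, keeping only a subset of tokens sharply reduces compute and memory. This is an illustrative sketch under assumed names (`drop_tokens`, `keep_ratio`), not the authors' implementation or their exact sampling scheme.

```python
import numpy as np

def drop_tokens(tokens: np.ndarray, keep_ratio: float = 0.25,
                seed: int = 0) -> np.ndarray:
    """Randomly keep a fraction of tokens along the sequence axis.

    tokens: (num_tokens, dim) array of patch embeddings.
    Returns a (ceil(keep_ratio * num_tokens), dim) subset, order preserved.
    """
    rng = np.random.default_rng(seed)
    n = tokens.shape[0]
    n_keep = max(1, int(np.ceil(keep_ratio * n)))
    # Sample token indices without replacement, then sort to keep
    # the original temporal/spatial ordering of the surviving tokens.
    idx = rng.choice(n, size=n_keep, replace=False)
    return tokens[np.sort(idx)]

# Hypothetical video clip: 8 frames x 196 patches = 1568 tokens of dim 768.
tokens = np.zeros((8 * 196, 768), dtype=np.float32)
kept = drop_tokens(tokens, keep_ratio=0.25)
print(kept.shape)  # (392, 768)
```

With a keep ratio of 0.25, the attention matrix shrinks by roughly 16×, which is consistent in spirit with the near 4× end-to-end training speed-up the abstract reports (other layers scale only linearly in token count).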
Main Authors: | Han, T; Xie, W; Zisserman, A |
---|---|
Format: | Conference item |
Language: | English |
Published: | British Machine Vision Association, 2022 |
_version_ | 1797108452558897152 |
author | Han, T; Xie, W; Zisserman, A |
collection | OXFORD |
description | The objective of this paper is an efficient training method for video tasks. We make three contributions: (1) We propose Turbo training, a simple and versatile training paradigm for Transformers on multiple video tasks. (2) We illustrate the advantages of Turbo training on action classification, video-language representation learning, and long-video activity classification, showing that Turbo training can largely maintain competitive performance while achieving almost 4× speed-up and significantly less memory consumption. (3) Turbo training enables long-schedule video-language training and end-to-end long-video training, delivering competitive or superior performance than previous works, which were infeasible to train under limited resources. |
first_indexed | 2024-03-07T07:27:58Z |
id | oxford-uuid:d1e1a0f7-9ecc-4617-bf0a-1abb56aafdb1 |
institution | University of Oxford |
last_indexed | 2024-03-07T07:27:58Z |
record_format | dspace |