Read and attend: temporal localisation in sign language videos

The objective of this work is to annotate sign instances across a broad vocabulary in continuous sign language. We train a Transformer model to ingest a continuous signing stream and output a sequence of written tokens on a large-scale collection of signing footage with weakly-aligned subtitles. We...

Bibliographic details
Main Authors: Varol, G, Momeni, L, Albanie, S, Afouras, T, Zisserman, A
Format: Conference item
Language: English
Published: IEEE 2021