Read and attend: temporal localisation in sign language videos
The objective of this work is to annotate sign instances across a broad vocabulary in continuous sign language. Using a large-scale collection of signing footage with weakly-aligned subtitles, we train a Transformer model to ingest a continuous signing stream and output a sequence of written tokens. We...
Main Authors: | , , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | IEEE, 2021 |