Self-supervised video representation learning by uncovering spatio-temporal statistics
This paper proposes a novel pretext task to address the self-supervised video representation learning problem. Specifically, given an unlabeled video clip, we compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion, the spa...
Main authors: | , , , , , |
---|---|
Material type: | Journal article |
Language: | English |
Published: | IEEE 2021 |