This document: Self-supervised learning of structural representations of visual objects