Text this: Unsupervised learning of object landmarks by factorized spatial embeddings