Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation