Are vision transformers always more robust than convolutional neural networks?