Multi-Head Self-Attention Gated-Dilated Convolutional Neural Network for Word Sense Disambiguation