Summary: | The Internet has seen substantial growth in regional-language data in recent years, which lets people express their opinions without language barriers. Urdu is used by 170.2 million people for communication. Sentiment analysis is used to gain insight into people's opinions. Researchers' interest in Urdu sentiment analysis has grown in recent years, but the application of deep learning methods to it remains little explored. Because Urdu is a morphologically rich language, much ground remains to be covered in Urdu text processing. In this paper, we propose a framework for Urdu Text Sentiment Analysis (UTSA) that explores deep learning techniques in combination with various word vector representations. We evaluate the performance of Long Short-Term Memory (LSTM), attention-based Bidirectional LSTM (BiLSTM-ATT), Convolutional Neural Network (CNN), and CNN-LSTM (C-LSTM) models for sentiment analysis. Stacked layers are used in the sequential models (LSTM, BiLSTM-ATT, and C-LSTM). In the CNN, various filters are used with a single convolution layer. The role of pre-trained and unsupervised self-trained embedding models in the sentiment classification task is also investigated. The results show that BiLSTM-ATT outperforms the other deep learning models, achieving 77.9% accuracy and a 72.7% F1 score.
|