Summary: | The focus of this study is to provide a model to be used for the identification of sentiments of comments about education and profession life of software engineering in social media and microblogging sites. Such a pre-trained model can be useful to evaluate students’ and software engineers’ feedbacks about software engineering. This problem is considered as a supervised text classification problem, which thereby requires a dataset for the training process. To do so, a survey is conducted among students of a software engineering department. In the classification phase, we represent the corpus by using conventional and word-embedding text representation schemes and yield accuracy, recall and precision results by using conventional supervised machine learning classifiers and well-known deep learning architectures. In the experimental analysis, first we focus on achieving classification results by using three conventional text representation schemes and three N-gram models in conjunction with five classifiers (i.e., naïve bayes, k-nearest neighbor algorithm, support vector machines, random forest and logistic regression). In addition, we evaluate the performances of three ensemble learners and three deep learning architectures (i.e. convolutional neural network, recurrent neural network, and long short-term memory). The empirical results indicate that deep learning architectures outperform conventional supervised machine learning classifiers and ensemble learners.
|