Knowledge-aware representation learning for natural language processing applications


Bibliographic Details
Main Author: Zhang, Jiaheng
Other Authors: Mao, Kezhi
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2023
Online Access:https://hdl.handle.net/10356/171068
Description
Summary: Natural Language Processing (NLP) stands as a vital subfield of artificial intelligence, empowering computers to interpret and understand human language. In recent years, NLP has become woven into daily life, with applications spanning sentiment analysis, text classification, and dialogue systems. The bedrock of success in NLP applications is text representation: robust text representations extract critical, distinguishing, and meaningful information from raw textual data. The advent of deep learning models has transformed NLP, outperforming conventional rule-based systems. These models typically comprise a sequence of stages: raw data preprocessing, feature extraction, and classification. Nonetheless, challenges persist. Overfitting is a common pitfall, stemming primarily from the limited diversity and formatting of raw data, and the limited interpretability of data-driven approaches remains an obstacle to real-world deployment. A promising avenue for addressing these challenges is the infusion of external knowledge into NLP models. This thesis explores knowledge-enriched solutions across three pivotal NLP applications:

- Sentiment analysis: external knowledge, drawn from sentiment-related lexicons such as WordNet, is incorporated into conventional deep learning models via Siamese networks, enabling a more nuanced understanding of emotion.

- Text classification: a novel multi-scaled topic embedding methodology is introduced that merges external knowledge sources with deep neural networks, yielding a significant boost in classification accuracy by capitalizing on domain-specific insights.

- Answer selection: four distinct network architectures leveraging topic embeddings are proposed, producing superior text representations that prove instrumental in enhancing answer selection tasks.
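To give a flavor of the general idea, the following is a minimal toy sketch of knowledge-enriched text representation: a sentence vector pooled from word embeddings is concatenated with features derived from an external sentiment lexicon. The embeddings, lexicon entries, and feature choices here are illustrative assumptions, not the thesis's actual models or data.

```python
import numpy as np

# Hypothetical toy word embeddings and sentiment lexicon
# (illustrative values only; the thesis's actual resources differ).
EMBED = {
    "great": np.array([0.9, 0.1]),
    "movie": np.array([0.2, 0.8]),
    "awful": np.array([-0.7, 0.3]),
}
LEXICON = {"great": 1.0, "awful": -1.0}  # word -> sentiment polarity


def knowledge_enriched_repr(tokens):
    """Mean-pool word embeddings, then append lexicon-derived features."""
    vecs = [EMBED[t] for t in tokens if t in EMBED]
    pooled = np.mean(vecs, axis=0) if vecs else np.zeros(2)
    polarities = [LEXICON.get(t, 0.0) for t in tokens]
    # Knowledge features: mean polarity and number of lexicon hits.
    knowledge = np.array([
        float(np.mean(polarities)) if polarities else 0.0,
        float(sum(t in LEXICON for t in tokens)),
    ])
    return np.concatenate([pooled, knowledge])


print(knowledge_enriched_repr(["great", "movie"]))  # 4-dim enriched vector
```

In a full system, the concatenated vector would feed a downstream classifier; the point of the sketch is only that external knowledge enters as extra features alongside the learned representation.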
By delving into knowledge-enriched representation learning within NLP applications, this thesis presents innovative methodologies tailored to the unique characteristics of each application. Empirical findings underscore the efficacy of these knowledge-enriched approaches in improving on baseline systems, thereby paving the way for future research in the field.