Knowledge graph embedding with deep learning

Bibliographic Details
Main Author: Chen, Chen
Other Authors: Lam Kwok Yan
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/173397
Description
Summary: Knowledge graphs (KGs) are widely used to represent structured knowledge, such as entities and their relationships, in applications like natural language processing, information retrieval, and recommendation systems. However, real-world domains are complex, leading to incomplete and error-prone KGs. Knowledge graph completion (KGC) addresses this by predicting missing links and improving KG quality. Knowledge graph embedding (KGE) is a promising approach for KGC, converting KG data into low-dimensional vector representations using deep learning and other techniques. This thesis focuses on deep learning methods for knowledge graph embedding.

First, we focus on graph-based KGC methods. Existing graph-based methods generally learn continuous embeddings for entities and relations with shallow linear transformations or deep convolutional modules. These methods either suffer from poor expressiveness or impose unnecessary image-specific inductive biases on the embedding models, which can degrade model performance. To avoid these issues, we propose a Transformer-based Patch Refinement Model (PatReFormer) under a “Separate-and-Aggregate” framework, which segments the input entity and relation embeddings into patches and uses a cross-attentive Transformer architecture for aggregation.

Second, we incorporate textual information, such as entity and relation descriptions, into KGC and propose a method based on an encoder-only pre-trained language model (PLM). Recently proposed fine-tuned PLMs often focus overwhelmingly on textual information and overlook structural knowledge. To address this issue, we propose CSProm-KG (Conditional Soft Prompts for KGC), which maintains a balance between structural information and textual knowledge. CSProm-KG tunes only the parameters of the Conditional Soft Prompts, which are generated from the entity and relation representations, and freezes the parameters of the PLM. In this way, the proposed approach can weigh both kinds of information equally and effectively during the KGC process.

Third, rather than relying on an encoder-only model to learn from KG textual information, we propose a novel approach based on the sequence-to-sequence (Seq2Seq) paradigm that directly predicts the text of the target entity. Existing KGC solutions often cater to specific graph structures, resulting in methods that are incompatible across different KGC tasks. Such methodological discrepancies not only incur significant maintenance costs but also hinder adaptability to evolving knowledge queries, ingestion processes, and presentation requirements. To address these challenges, we leverage the strong performance and technical homogeneity that Seq2Seq PLMs have demonstrated across various NLP tasks. We introduce a straightforward yet highly effective Seq2Seq PLM framework, called KG-S2S, that adapts to diverse knowledge graph structures.

Lastly, we extend KGC techniques to address challenges in Internet of Things (IoT) services. IoT profiling has recently gained attention as a promising method for validating the normal behavior of connected devices in these services. However, a significant challenge is how to process the vast number of IoT profiles effectively to identify suspicious devices that require closer monitoring. To tackle this challenge, we propose HABIT, a holistic and novel framework that regards the behaviors of connected devices as a KG and detects “false” knowledge using KGC techniques. By drawing on cutting-edge KGC techniques, HABIT offers a comprehensive profiling approach for accurately identifying anomalous behaviors in IoT services.
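
The idea of completing a KG by predicting missing links from low-dimensional embeddings can be illustrated with a minimal sketch. It is not one of the thesis's models: it assumes a generic translation-style scoring function (a tail entity is plausible when head + relation lands close to it), and the entity names, relation name, hand-picked 2-D vectors, and function names are all hypothetical. In practice the embeddings would be learned from data.

```python
import math

# Hypothetical toy embeddings, chosen by hand for illustration only.
# A real KGE model would learn these vectors during training.
ENTITY_EMB = {
    "Singapore": [1.0, 0.0],
    "Asia": [1.0, 1.0],
    "Europe": [-1.0, 1.0],
    "Paris": [-1.0, 0.0],
}
RELATION_EMB = {
    "located_in": [0.0, 1.0],
}


def score(head, relation, tail):
    """Translation-style plausibility of (head, relation, tail).

    Returns the negative Euclidean distance between head + relation
    and tail, so values closer to zero mean a more plausible triple.
    """
    h = ENTITY_EMB[head]
    r = RELATION_EMB[relation]
    t = ENTITY_EMB[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every other entity as a candidate tail, most plausible first."""
    candidates = [e for e in ENTITY_EMB if e != head]
    return sorted(candidates, key=lambda t: score(head, relation, t), reverse=True)


if __name__ == "__main__":
    # Answer the incomplete query (Singapore, located_in, ?).
    print(rank_tails("Singapore", "located_in"))
```

Under the same assumption, a triple with an unusually low score relative to its alternatives could be flagged as implausible, which is the general intuition behind treating “false” knowledge detection as a KGC problem.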