Language models for ontology engineering

Bibliographic details
Main author:	He, Y
Other authors:	Horrocks, I
Format:	Thesis
Language:	English
Published:	2024
Subjects:	Deep learning (Machine learning); Modeling languages (Computer science); Ontology; OWL (Web ontology language); Natural language processing (Computer science); Artificial intelligence

_version_	1826314656104316928
author	He, Y
author2	Horrocks, I
author_facet	Horrocks, I He, Y
author_sort	He, Y
collection	OXFORD
description	<p>Ontology, originally a philosophical term, refers to the study of being and existence. The concept was introduced to Artificial Intelligence (AI) as a knowledge-based system that can model and share knowledge about entities and their relationships in a machine-readable format. Ontologies offer a structured and logical formalism of human knowledge, enabling expressive representations and reliable reasoning within defined domains. Meanwhile, modern deep learning-based language models (LMs) represent a significant milestone in the field of Natural Language Processing (NLP), as they incorporate substantial background knowledge from the vast and complex distribution of textual data. This thesis explores the synergy between these two paradigms, focusing primarily on the use of LMs in ontology engineering and, more broadly, in knowledge engineering. The goal is to automate or semi-automate the process of ontology construction and curation.</p> <p>Ontology engineering includes a wide array of tasks within the life cycle of ontology development. This thesis concentrates on three key aspects: (<em>i</em>) ontology alignment, which seeks to align equivalent concepts across different ontologies to achieve data integration; (<em>ii</em>) ontology completion, which focuses on filling in missing subsumption relationships between ontology concepts; and (<em>iii</em>) hierarchy embedding, which aims to develop versatile and interpretable neural representations for hierarchical structures derived not only from ontologies but also applicable to other forms of hierarchical data. These representations can facilitate a broad spectrum of downstream ontology engineering tasks, such as (<em>i</em>) and (<em>ii</em>), and are adaptable for more general applications in hierarchy-aware contexts.</p> <p>This thesis is organised into three parts. The first part establishes the foundations necessary for understanding ontologies and LMs. 
The chapter on ontologies opens with a basic overview of computational ontologies, then introduces the description logic formalisms that underpin them. It concludes with formal definitions of the three ontology engineering tasks this thesis focuses on. Turning to LMs, the subsequent chapter begins with a chronological overview of their evolution, followed by a detailed exposition of typical LMs along that evolution. The discussion then proceeds to contemporary transformer-based LMs, elaborating on their architecture and the different learning paradigms they adopt. The chapter concludes with a review of how LMs and knowledge bases (including ontologies) interact and influence each other, highlighting the mutual benefits of this integration for both fields of study.</p> <p>Building on the background provided in the first part, the second part of the thesis presents the specific methodologies that have been developed. This part comprises three chapters, covering the application of LMs to ontology alignment, ontology completion, and hierarchy embedding, respectively. In the chapter on LMs for ontology alignment, we introduce BERTMap, a novel pipeline system that employs LM fine-tuning for improved alignment prediction and ontology semantics for alignment refinement. We also mention the Bio-ML track of the Ontology Alignment Evaluation Initiative (OAEI), which has emerged as a benchmarking platform for a variety of ontology alignment systems over the past two years. The chapter on LMs for ontology completion presents OntoLAMA, a collection of LM probing datasets and a prompt-based LM probing approach that effectively predicts subsumptions, even with limited training resources. 
Lastly, the chapter on LMs for hierarchy embedding discusses the re-training of LMs as Hierarchy Transformer encoders (HiT), addressing the limitations of LMs in explicitly interpreting and encoding hierarchies, including those extracted from ontologies.</p> <p>The third part of the thesis details the practical implementations. We mainly present DeepOnto, a Python package designed for ontology engineering with deep learning, with an emphasis on LMs. DeepOnto offers ontology processing functionality, ranging from basic to advanced, to support the development of deep learning-based ontology engineering. The package also includes polished implementations of the systems and resources presented in Part II.</p> <p>In summary, this thesis advocates for a more holistic approach in AI development, where the integration of LMs and ontologies can lead to a more advanced, explainable, and useful paradigm in knowledge engineering and beyond.</p>
first_indexed	2024-09-25T04:36:31Z
format	Thesis
id	oxford-uuid:e9a2c06d-79ce-4652-b561-91dd56acee4f
institution	University of Oxford
language	English
last_indexed	2024-12-09T03:10:37Z
publishDate	2024
record_format	dspace
spelling	oxford-uuid:e9a2c06d-79ce-4652-b561-91dd56acee4f 2024-09-26T09:11:16Z Language models for ontology engineering Thesis http://purl.org/coar/resource_type/c_db06 uuid:e9a2c06d-79ce-4652-b561-91dd56acee4f Deep learning (Machine learning) Modeling languages (Computer science) Ontology OWL (Web ontology language) Natural language processing (Computer science) Artificial intelligence English Hyrax Deposit 2024 He, Y Horrocks, I Cuenca Grau, B Chen, J
spellingShingle	Deep learning (Machine learning) Modeling languages (Computer science) Ontology OWL (Web ontology language) Natural language processing (Computer science) Artificial intelligence He, Y Language models for ontology engineering
title	Language models for ontology engineering
title_full	Language models for ontology engineering
title_fullStr	Language models for ontology engineering
title_full_unstemmed	Language models for ontology engineering
title_short	Language models for ontology engineering
title_sort	language models for ontology engineering
topic	Deep learning (Machine learning) Modeling languages (Computer science) Ontology OWL (Web ontology language) Natural language processing (Computer science) Artificial intelligence
work_keys_str_mv	AT hey languagemodelsforontologyengineering
