Knowledge-Based Intelligent Text Simplification for Biological Relation Extraction
Relation extraction from biological publications plays a pivotal role in accelerating scientific discovery and advancing medical research. While vast amounts of this knowledge are stored within the published literature, extracting it manually from this continually growing volume of documents is becom...
Main Authors: | Jaskaran Gill, Madhu Chetty, Suryani Lim, Jennifer Hallinan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-12-01 |
Series: | Informatics |
Subjects: | sentence simplification, named entity recognition, relation extraction, BioBERT, BERN2 |
Online Access: | https://www.mdpi.com/2227-9709/10/4/89 |
_version_ | 1797380593416142848 |
---|---|
author | Jaskaran Gill, Madhu Chetty, Suryani Lim, Jennifer Hallinan |
author_sort | Jaskaran Gill |
collection | DOAJ |
description | Relation extraction from biological publications plays a pivotal role in accelerating scientific discovery and advancing medical research. While vast amounts of this knowledge are stored within the published literature, extracting it manually from this continually growing volume of documents is becoming increasingly arduous. Recently, attention has focused on extracting such knowledge automatically using pre-trained Large Language Models (LLMs) and deep-learning algorithms. However, the complex syntactic structure of biological sentences, with nested entities and domain-specific terminology, and the lack of sufficient annotated training corpora pose major challenges to accurately capturing entity relationships from unstructured data. To address these issues, in this paper, we propose a Knowledge-based Intelligent Text Simplification (KITS) approach focused on the accurate extraction of biological relations. KITS captures the relational context among the binary relations within a sentence precisely and accurately, while preventing changes in meaning in the sentences it simplifies. Experiments evaluated with well-known performance metrics show that the proposed technique yields a 21% increase in precision on the Learning Language in Logic (LLL) dataset, with only 25% of its sentences simplified. When combined with the proposed method, BioBERT, a popular pre-trained LLM, outperformed other state-of-the-art methods. |
first_indexed | 2024-03-08T20:40:33Z |
format | Article |
id | doaj.art-1ddbcedea6cf41579092181c67614499 |
institution | Directory of Open Access Journals |
issn | 2227-9709 |
language | English |
last_indexed | 2024-03-08T20:40:33Z |
publishDate | 2023-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Informatics |
spelling | doaj.art-1ddbcedea6cf41579092181c67614499; 2023-12-22T14:15:47Z; eng; MDPI AG; Informatics; ISSN 2227-9709; 2023-12-01; vol. 10, no. 4, art. 89; DOI 10.3390/informatics10040089; Knowledge-Based Intelligent Text Simplification for Biological Relation Extraction; Jaskaran Gill, Madhu Chetty, Suryani Lim, Jennifer Hallinan (all: Health Innovation and Transformation Centre, Federation University, Ballarat, VIC 3842, Australia); https://www.mdpi.com/2227-9709/10/4/89; sentence simplification, named entity recognition, relation extraction, BioBERT, BERN2 |
title | Knowledge-Based Intelligent Text Simplification for Biological Relation Extraction |
title_sort | knowledge based intelligent text simplification for biological relation extraction |
topic | sentence simplification, named entity recognition, relation extraction, BioBERT, BERN2 |
url | https://www.mdpi.com/2227-9709/10/4/89 |