A study on pharmaceutical text relationship extraction based on heterogeneous graph neural networks
Effective information extraction from pharmaceutical texts is of great significance for clinical research. Ancient Chinese medicine texts use concise sentences and carry complex semantic relationships, and those relationships may hold between heterogeneous entities. The current mainstream rela...
Main Authors: | Shuilong Zou, Zhaoyang Liu, Kaiqi Wang, Jun Cao, Shixiong Liu, Wangping Xiong, Shaoyi Li |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press, 2024-01-01 |
Series: | Mathematical Biosciences and Engineering |
Subjects: | medical text; relation extraction; BERT; heterogeneous graph neural networks |
Online Access: | https://www.aimspress.com/article/doi/10.3934/mbe.2024064?viewType=HTML |
_version_ | 1797326302887280640 |
---|---|
author | Shuilong Zou, Zhaoyang Liu, Kaiqi Wang, Jun Cao, Shixiong Liu, Wangping Xiong, Shaoyi Li |
author_sort | Shuilong Zou |
collection | DOAJ |
description | Effective information extraction from pharmaceutical texts is of great significance for clinical research. Ancient Chinese medicine texts use concise sentences and carry complex semantic relationships, and those relationships may hold between heterogeneous entities. Current mainstream relationship extraction models do not take into account the associations between entities and relationships during extraction, so they capture too little semantic information to form an effective structured representation. In this paper, we propose a heterogeneous graph neural network relationship extraction model adapted to traditional Chinese medicine (TCM) text. First, the given sentence and the predefined relationships are embedded with fine-tuned BERT (bidirectional encoder representations from transformers) word embeddings as the model input. Second, a heterogeneous graph network is constructed to associate word, phrase, and relationship nodes and obtain the hidden-layer representation. Then, in the decoding stage, a two-stage subject-object entity identification method is adopted: the identifier uses a binary classifier to locate the start and end positions of TCM entities, identifies all subject and object entities in the sentence, and finally forms the TCM entity-relationship groups. Experiments on the TCM relationship extraction dataset show that the heterogeneous graph neural network with BERT embeddings achieves a precision of 86.99% and an F1 value of 87.40%, which are improvements of 8.83% and 10.21% compared with the relationship extraction models CNN, BERT-CNN, and Graph LSTM. |
first_indexed | 2024-03-08T06:22:22Z |
format | Article |
id | doaj.art-7a1b084005f5462b876df916494f3156 |
institution | Directory of Open Access Journals |
issn | 1551-0018 |
language | English |
last_indexed | 2024-03-08T06:22:22Z |
publishDate | 2024-01-01 |
publisher | AIMS Press |
record_format | Article |
series | Mathematical Biosciences and Engineering |
spelling | doaj.art-7a1b084005f5462b876df916494f3156; 2024-02-04T01:33:52Z; eng; AIMS Press; Mathematical Biosciences and Engineering; ISSN 1551-0018; 2024-01-01; vol. 21, no. 1, pp. 1489-1507; DOI 10.3934/mbe.2024064; A study on pharmaceutical text relationship extraction based on heterogeneous graph neural networks; Shuilong Zou (1), Zhaoyang Liu (2), Kaiqi Wang (2), Jun Cao (2), Shixiong Liu (1), Wangping Xiong (2), Shaoyi Li (1); affiliations: 1. Nanchang Institute of Science & Technology, Nanchang 330004, China; 2. School of Computer, Jiangxi University of Chinese Medicine, Nanchang 330004, China; https://www.aimspress.com/article/doi/10.3934/mbe.2024064?viewType=HTML; keywords: medical text, relation extraction, BERT, heterogeneous graph neural networks |
title | A study on pharmaceutical text relationship extraction based on heterogeneous graph neural networks |
topic | medical text; relation extraction; BERT; heterogeneous graph neural networks |
url | https://www.aimspress.com/article/doi/10.3934/mbe.2024064?viewType=HTML |
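
The decoding stage summarized in the description (a binary classifier that marks the start and end positions of each entity) can be illustrated with a minimal sketch. The record does not say how start and end positions are paired or what decision threshold is used, so the nearest-end pairing rule, the 0.5 threshold, the function name `decode_spans`, and the example scores below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def decode_spans(
    start_probs: Sequence[float],
    end_probs: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; not given in the record
) -> List[Tuple[int, int]]:
    """Turn per-token start/end scores into (start, end) token spans.

    Each position whose start score reaches the threshold is paired with
    the nearest position at or after it whose end score also reaches the
    threshold (an assumed pairing heuristic). Indices are inclusive.
    """
    if len(start_probs) != len(end_probs):
        raise ValueError("start_probs and end_probs must have the same length")

    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [i for i, p in enumerate(end_probs) if p >= threshold]

    spans: List[Tuple[int, int]] = []
    for s in starts:
        # First end position that is not before this start, if any.
        e = next((j for j in ends if j >= s), None)
        if e is not None:
            spans.append((s, e))
    return spans


# Hypothetical scores for a six-token sentence (illustrative values only).
start_scores = [0.9, 0.1, 0.0, 0.8, 0.2, 0.1]
end_scores = [0.1, 0.7, 0.0, 0.1, 0.9, 0.0]
spans = decode_spans(start_scores, end_scores)
print(spans)  # [(0, 1), (3, 4)]
```

In a two-stage scheme of the kind the description mentions, a routine like this would plausibly run once to find subject spans and again, conditioned on each subject and candidate relationship, to find object spans; that conditioning step is not shown here because the record gives no detail on it.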