Denoising Graph Inference Network for Document-Level Relation Extraction

Relation Extraction (RE) is to obtain a predefined relation type of two entities mentioned in a piece of text, e.g., a sentence-level or a document-level text. Most existing studies suffer from the noise in the text, and necessary pruning is of great importance. The conventional sentence-level RE ta...

Bibliographic Details
Main Authors: Hailin Wang, Ke Qin, Guiduo Duan, Guangchun Luo
Format: Article
Language: English
Published: Tsinghua University Press 2023-06-01
Series: Big Data Mining and Analytics
Subjects: relation extraction (RE); document-level; denoising; linguistic knowledge; attention mechanism
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2022.9020051
_version_ 1797902876033417216
author Hailin Wang
Ke Qin
Guiduo Duan
Guangchun Luo
author_facet Hailin Wang
Ke Qin
Guiduo Duan
Guangchun Luo
author_sort Hailin Wang
collection DOAJ
description Relation Extraction (RE) aims to identify the predefined relation type between two entities mentioned in a piece of text, e.g., a sentence or a document. Most existing studies suffer from noise in the text, so pruning is important. Conventional sentence-level RE addresses this issue with a denoising method that uses the shortest dependency path to build long-range semantic dependencies between entity pairs. However, such denoising methods are scarce in document-level RE. In this work, we explicitly model a denoised document-level graph based on linguistic knowledge to capture various long-range semantic dependencies among entities. We first formalize a Syntactic Dependency Tree forest (SDT-forest) by introducing syntactic and discourse dependency relations. Then, a Steiner tree algorithm extracts a mention-level denoised graph, the Steiner Graph (SG), removing linguistically irrelevant words from the SDT-forest. We then devise a slide residual attention mechanism to highlight word-level evidence in the text and the SG. Finally, classification is performed on the SG to infer the relations of entity pairs. We conduct extensive experiments on three public datasets. The results show that our method helps establish long-range semantic dependencies and improves classification performance on longer texts.
first_indexed 2024-04-10T09:24:10Z
format Article
id doaj.art-40d4b8484f354074b650b0adcab82842
institution Directory Open Access Journal
issn 2096-0654
language English
last_indexed 2024-04-10T09:24:10Z
publishDate 2023-06-01
publisher Tsinghua University Press
record_format Article
series Big Data Mining and Analytics
spelling doaj.art-40d4b8484f354074b650b0adcab82842 2023-02-20T07:01:55Z eng
Tsinghua University Press
Big Data Mining and Analytics 2096-0654 2023-06-01, vol. 6, no. 2, pp. 248-262
10.26599/BDMA.2022.9020051
Denoising Graph Inference Network for Document-Level Relation Extraction
Hailin Wang (0), Ke Qin (1), Guiduo Duan (2), Guangchun Luo (3)
(0-3) School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
https://www.sciopen.com/article/10.26599/BDMA.2022.9020051
relation extraction (re)
document-level
denoising
linguistic knowledge
attention mechanism
spellingShingle Hailin Wang
Ke Qin
Guiduo Duan
Guangchun Luo
Denoising Graph Inference Network for Document-Level Relation Extraction
Big Data Mining and Analytics
relation extraction (re)
document-level
denoising
linguistic knowledge
attention mechanism
title Denoising Graph Inference Network for Document-Level Relation Extraction
title_full Denoising Graph Inference Network for Document-Level Relation Extraction
title_fullStr Denoising Graph Inference Network for Document-Level Relation Extraction
title_full_unstemmed Denoising Graph Inference Network for Document-Level Relation Extraction
title_short Denoising Graph Inference Network for Document-Level Relation Extraction
title_sort denoising graph inference network for document level relation extraction
topic relation extraction (re)
document-level
denoising
linguistic knowledge
attention mechanism
url https://www.sciopen.com/article/10.26599/BDMA.2022.9020051
work_keys_str_mv AT hailinwang denoisinggraphinferencenetworkfordocumentlevelrelationextraction
AT keqin denoisinggraphinferencenetworkfordocumentlevelrelationextraction
AT guiduoduan denoisinggraphinferencenetworkfordocumentlevelrelationextraction
AT guangchunluo denoisinggraphinferencenetworkfordocumentlevelrelationextraction