Mining heuristic evidence sentences for more interpretable document-level relation extraction


Bibliographic Details
Main Authors: Taojie Zhu, Jicang Lu, Gang Zhou, Xiaoyao Ding, Panpan Guo, Hao Wu
Format: Article
Language: English
Published: Elsevier 2023-07-01
Series: Journal of King Saud University: Computer and Information Sciences
Online Access: http://www.sciencedirect.com/science/article/pii/S1319157823001970
Description
Summary: Current research on evidence sentences aims to make document-level relation extraction models more interpretable. Evidence sentences extracted by existing methods are often incomplete, which lowers relation prediction accuracy. To address this problem, we developed a novel, efficient heuristic rule and an entity representation method. First, a heuristic rule is constructed from the interactions between the different mentions of the head and tail entities of the target entity pair, and evidence sentences are extracted with it. Second, pseudo documents, built from the extracted sentences in their original document order, are used as the input text so that noisy sentences are removed. Finally, a separate representation of the same entity is learned for each entity pair it appears in, based on the interactions between the mentions of the head and tail entities, so that the entity is represented more accurately. Experiments on DocRED, a general-domain document-level dataset, indicated that our heuristic rules improved evidence sentence extraction by 6.01% over the baseline model Paths-BiLSTM. In relation prediction, the accuracy of the proposed method was comparable to that of existing models that take the entire document as input; however, the input text used by the proposed method was shorter and more interpretable.
ISSN: 1319-1578
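
Illustrative sketch: the summary describes two steps, selecting evidence sentences from the mentions of the head and tail entities and then assembling them into a pseudo document in original order. The Python below is a minimal sketch of that general idea, not the authors' rule. It assumes a simplified co-occurrence heuristic (keep sentences that mention both entities, and fall back to sentences that mention either one when none does), and the function names select_evidence_sentences and build_pseudo_document are hypothetical.

```python
from typing import List, Set


def select_evidence_sentences(head_sents: Set[int], tail_sents: Set[int]) -> List[int]:
    """Pick candidate evidence sentence indices for one entity pair.

    head_sents / tail_sents hold the indices of sentences that contain a
    mention of the head / tail entity. This co-occurrence rule is an
    assumption made for illustration, not the rule from the article.
    """
    co_occurring = head_sents & tail_sents
    if co_occurring:
        # Sentences that mention both entities are the strongest candidates.
        return sorted(co_occurring)
    # Assumed fallback: keep every sentence that mentions either entity.
    return sorted(head_sents | tail_sents)


def build_pseudo_document(sentences: List[str], evidence_ids: List[int]) -> str:
    """Join the selected sentences in their original document order."""
    return " ".join(sentences[i] for i in sorted(set(evidence_ids)))


sentences = [
    "Alice was born in Paris.",
    "She studied physics.",
    "Paris is the capital of France.",
    "Alice later moved to Berlin.",
]
# Head entity "Alice" is mentioned in sentences 0 and 3, tail "Paris" in 0 and 2.
evidence = select_evidence_sentences({0, 3}, {0, 2})
pseudo_document = build_pseudo_document(sentences, evidence)
```

In this toy example only sentence 0 mentions both entities, so the pseudo document is that single sentence, which is much shorter than the full document that would otherwise be fed to the relation classifier.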