Distant Supervision for Relation Extraction with Ranking-Based Methods

Relation extraction has benefited from distant supervision in recent years, aided by advances in natural language processing and the rapid growth of available data. However, distant supervision is still greatly limited by the quality of its training data, a consequence of its core motivation of greatly reducing the...

Bibliographic Details
Main Authors: Yang Xiang, Qingcai Chen, Xiaolong Wang, Yang Qin
Format: Article
Language: English
Published: MDPI AG 2016-05-01
Series: Entropy
Subjects: distant supervision; relation extraction; multi-instance multi-label learning; ranking
Online Access: http://www.mdpi.com/1099-4300/18/6/204
_version_ 1828113992322646016
collection DOAJ
description Relation extraction has benefited from distant supervision in recent years, aided by advances in natural language processing and the rapid growth of available data. However, distant supervision is still greatly limited by the quality of its training data, a consequence of its core motivation: greatly reducing the heavy cost of data annotation. In this paper, we construct an architecture called MIML-sort (Multi-instance Multi-label Learning with Sorting Strategies), which is built on the well-known MIML framework. Based on MIML-sort, we propose three ranking-based methods for sample selection, with which we learn relation extractors from a subset of the training data. Experiments are conducted on the KBP (Knowledge Base Population) corpus, a large and noisy benchmark dataset for distant supervision. Compared with previous work, the proposed methods produce considerably better results. Furthermore, the three methods combined achieve the best F1 on the official test set, raising F1 from 27.3% to 29.98% in the best case.
first_indexed 2024-04-11T12:18:41Z
id doaj.art-d1dadd29ef684e2aa19b7e8c24c37ee3
institution Directory Open Access Journal
issn 1099-4300
last_indexed 2024-04-11T12:18:41Z
spelling doaj.art-d1dadd29ef684e2aa19b7e8c24c37ee3; 2022-12-22T04:24:11Z; eng. Yang Xiang, Qingcai Chen, Xiaolong Wang, Yang Qin (all: Intelligence Computing Research Center, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China). Distant Supervision for Relation Extraction with Ranking-Based Methods. Entropy (MDPI AG, ISSN 1099-4300), vol. 18, no. 6, article 204 (e18060204), 2016-05-01. DOI: 10.3390/e18060204.
topic distant supervision
relation extraction
multi-instance multi-label learning
ranking