Distant Supervision for Relation Extraction with Ranking-Based Methods
Relation extraction has benefited from distant supervision in recent years with the development of natural language processing techniques and data explosion. However, distant supervision is still greatly limited by the quality of training data, due to its natural motivation for greatly reducing the...
Main Authors: | Yang Xiang, Qingcai Chen, Xiaolong Wang, Yang Qin |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2016-05-01 |
Series: | Entropy |
Subjects: | distant supervision, relation extraction, multi-instance multi-label learning, ranking |
Online Access: | http://www.mdpi.com/1099-4300/18/6/204 |
_version_ | 1828113992322646016 |
---|---|
author | Yang Xiang Qingcai Chen Xiaolong Wang Yang Qin |
collection | DOAJ |
description | Relation extraction has benefited from distant supervision in recent years, driven by advances in natural language processing techniques and the explosion of available data. However, distant supervision is still greatly limited by the quality of its training data, because it is designed to avoid the heavy cost of manual data annotation. In this paper, we construct an architecture called MIML-sort (Multi-instance Multi-label Learning with Sorting Strategies), which is built on the well-known MIML framework. Based on MIML-sort, we propose three ranking-based methods for sample selection, with which we identify relation extractors from a subset of the training data. Experiments are set up on the KBP (Knowledge Base Population) corpus, a large and noisy benchmark dataset for distant supervision. Compared with previous work, the proposed methods produce considerably better results. Furthermore, the three methods together achieve the best F1 on the official testing set, raising F1 from 27.3% to as much as 29.98%. |
first_indexed | 2024-04-11T12:18:41Z |
format | Article |
id | doaj.art-d1dadd29ef684e2aa19b7e8c24c37ee3 |
institution | Directory of Open Access Journals |
issn | 1099-4300 |
language | English |
last_indexed | 2024-04-11T12:18:41Z |
publishDate | 2016-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Entropy |
spelling | doaj.art-d1dadd29ef684e2aa19b7e8c24c37ee3; 2022-12-22T04:24:11Z; eng; MDPI AG; Entropy; 1099-4300; 2016-05-01; 18; 6; 204; 10.3390/e18060204; e18060204; Distant Supervision for Relation Extraction with Ranking-Based Methods; Yang Xiang (0), Qingcai Chen (1), Xiaolong Wang (2), Yang Qin (3); affiliation (0-3): Intelligence Computing Research Center, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; http://www.mdpi.com/1099-4300/18/6/204; distant supervision; relation extraction; multi-instance multi-label learning; ranking |
title | Distant Supervision for Relation Extraction with Ranking-Based Methods |
topic | distant supervision relation extraction multi-instance multi-label learning ranking |
url | http://www.mdpi.com/1099-4300/18/6/204 |