Populating Web-Scale Knowledge Graphs Using Distantly Supervised Relation Extraction and Validation
In this paper, we propose a fully automated system to extend knowledge graphs using external information from web-scale corpora. The designed system leverages a deep-learning-based technology for relation extraction that can be trained by a distantly supervised approach. In addition, the system uses...
Main Authors: | Sarthak Dash, Michael R. Glass, Alfio Gliozzo, Mustafa Canim, Gaetano Rossiello |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-08-01 |
Series: | Information |
Subjects: | information extraction, knowledge graphs, deep learning |
Online Access: | https://www.mdpi.com/2078-2489/12/8/316 |
_version_ | 1797523506892636160 |
---|---|
author | Sarthak Dash Michael R. Glass Alfio Gliozzo Mustafa Canim Gaetano Rossiello |
author_sort | Sarthak Dash |
collection | DOAJ |
description | In this paper, we propose a fully automated system to extend knowledge graphs using external information from web-scale corpora. The designed system leverages a deep-learning-based technology for relation extraction that can be trained by a distantly supervised approach. In addition, the system uses a deep learning approach for knowledge base completion by utilizing the global structure information of the induced KG to further refine the confidence of the newly discovered relations. The designed system does not require any effort for adaptation to new languages and domains as it does not use any hand-labeled data, NLP analytics, or inference rules. Our experiments, performed on a popular academic benchmark, demonstrate that the suggested system boosts the performance of relation extraction by a wide margin, reporting error reductions of 50%, resulting in relative improvement of up to 100%. Furthermore, a web-scale experiment conducted to extend DBPedia with knowledge from Common Crawl shows that our system is not only scalable but also does not require any adaptation cost, while yielding a substantial accuracy gain. |
first_indexed | 2024-03-10T08:43:58Z |
format | Article |
id | doaj.art-7bdd82f8c8b746c1a3965e3af319567e |
institution | Directory of Open Access Journals |
issn | 2078-2489 |
language | English |
last_indexed | 2024-03-10T08:43:58Z |
publishDate | 2021-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Information |
spelling | doaj.art-7bdd82f8c8b746c1a3965e3af319567e; 2023-11-22T08:06:00Z; eng; MDPI AG; Information; ISSN 2078-2489; 2021-08-01; volume 12, issue 8, article 316; DOI 10.3390/info12080316; Populating Web-Scale Knowledge Graphs Using Distantly Supervised Relation Extraction and Validation; Sarthak Dash, Michael R. Glass, Alfio Gliozzo, Mustafa Canim, Gaetano Rossiello (all: IBM Research AI, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA); https://www.mdpi.com/2078-2489/12/8/316; information extraction, knowledge graphs, deep learning |
title | Populating Web-Scale Knowledge Graphs Using Distantly Supervised Relation Extraction and Validation |
topic | information extraction knowledge graphs deep learning |
url | https://www.mdpi.com/2078-2489/12/8/316 |