Neural networks for open and closed Literature-based Discovery.

Bibliographic Details
Main Authors: Gamal Crichton, Simon Baker, Yufan Guo, Anna Korhonen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0232891
author Gamal Crichton
Simon Baker
Yufan Guo
Anna Korhonen
collection DOAJ
description Literature-based Discovery (LBD) aims to discover new knowledge automatically from large collections of literature. Scientific literature is growing at an exponential rate, making it difficult for researchers to stay current in their discipline and easy to miss knowledge necessary to advance their research. LBD can facilitate hypothesis generation and testing and thus accelerate scientific progress. Neural networks have demonstrated improved performance on LBD-related tasks but have yet to be applied to LBD itself. We propose four graph-based neural network methods to perform open and closed LBD. We compared our methods with those used by the state-of-the-art LION LBD system on the same evaluations to replicate recently published findings in cancer biology. We also applied them to a time-sliced dataset of human-curated, peer-reviewed biological interactions. These evaluations and the metrics they employ represent performance on real-world knowledge advances and are thus robust indicators of approach efficacy. In the first set of experiments, our best methods performed 2-4 times better than the baselines in closed discovery and 2-3 times better in open discovery. In the second, our best methods performed almost 2 times better than the baselines in open discovery. These results strongly indicate that neural LBD is potentially a very effective approach for generating new scientific discoveries from existing literature. The code for our models and other information can be found at: https://github.com/cambridgeltl/nn_for_LBD.
first_indexed 2024-12-23T03:19:04Z
format Article
id doaj.art-36669e8595b14818b157ae528729532a
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-23T03:19:04Z
publishDate 2020-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-36669e8595b14818b157ae528729532a 2022-12-21T18:02:03Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2020-01-01 15(5): e0232891 10.1371/journal.pone.0232891
title Neural networks for open and closed Literature-based Discovery.
title_sort neural networks for open and closed literature based discovery
url https://doi.org/10.1371/journal.pone.0232891