Selective UMLS knowledge infusion for biomedical question answering
Abstract One of the artificial intelligence applications in the biomedical field is knowledge-intensive question answering. As domain expertise is particularly crucial in this field, we propose a method for efficiently infusing biomedical knowledge into pretrained language models, ultimately targeting biomedical question answering.
Main Authors: | Hyeryun Park, Jiye Son, Jeongwon Min, Jinwook Choi |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2023-08-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-023-41423-8 |
_version_ | 1797452596552663040 |
---|---|
author | Hyeryun Park, Jiye Son, Jeongwon Min, Jinwook Choi |
author_facet | Hyeryun Park Jiye Son Jeongwon Min Jinwook Choi |
author_sort | Hyeryun Park |
collection | DOAJ |
description | Abstract One of the artificial intelligence applications in the biomedical field is knowledge-intensive question answering. As domain expertise is particularly crucial in this field, we propose a method for efficiently infusing biomedical knowledge into pretrained language models, ultimately targeting biomedical question answering. Transferring all semantics of a large knowledge graph into the entire model requires too many parameters, increasing computational cost and time. We investigate an efficient approach that leverages adapters to inject Unified Medical Language System (UMLS) knowledge into pretrained language models, and we question the need to use all semantics in the knowledge graph. This study focuses on strategies for partitioning the knowledge graph and either discarding or merging some partitions for more efficient pretraining. According to the results on three biomedical question-answering finetuning datasets, the adapters pretrained on semantically partitioned groups performed more efficiently in terms of evaluation metrics, required parameters, and time. The results also show that discarding groups with fewer concepts is the better direction for small datasets, whereas merging these groups is better for large datasets. Furthermore, the metric results show a slight improvement, demonstrating that the adapter methodology is rather insensitive to the group formulation. |
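The partition-then-discard-or-merge strategy described in the abstract can be sketched minimally as below. This is an illustrative assumption of the general idea, not the authors' implementation: the semantic-group mapping, the triple format, and the size threshold are all hypothetical placeholders.

```python
from collections import defaultdict

def partition_triples(triples, semantic_group):
    """Group UMLS-style (head, relation, tail) triples by the head
    concept's semantic group (mapping is an illustrative assumption)."""
    groups = defaultdict(list)
    for head, rel, tail in triples:
        groups[semantic_group[head]].append((head, rel, tail))
    return dict(groups)

def prune_groups(groups, min_size, mode):
    """Keep groups with at least min_size triples; either discard the
    small ones ('discard') or merge them into one catch-all group
    ('merge'), mirroring the two strategies in the abstract."""
    kept = {g: ts for g, ts in groups.items() if len(ts) >= min_size}
    if mode == "merge":
        small = [t for g, ts in groups.items()
                 if len(ts) < min_size for t in ts]
        if small:
            kept["merged"] = small
    return kept

# Toy example with hypothetical concepts and semantic groups
sem = {"aspirin": "CHEM", "heart": "ANAT"}
triples = [("aspirin", "treats", "fever"),
           ("aspirin", "isa", "nsaid"),
           ("heart", "part_of", "body")]
groups = partition_triples(triples, sem)
discarded = prune_groups(groups, min_size=2, mode="discard")
merged = prune_groups(groups, min_size=2, mode="merge")
```

Each surviving group would then pretrain its own adapter, which is what keeps the parameter count and pretraining time low relative to infusing the whole knowledge graph into the full model.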
first_indexed | 2024-03-09T15:10:56Z |
format | Article |
id | doaj.art-91bba9a055134c479c141a97dfcdb55b |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-03-09T15:10:56Z |
publishDate | 2023-08-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-91bba9a055134c479c141a97dfcdb55b. Scientific Reports (Nature Portfolio), ISSN 2045-2322, 2023-08-01, https://doi.org/10.1038/s41598-023-41423-8. Selective UMLS knowledge infusion for biomedical question answering. Hyeryun Park, Jiye Son, Jeongwon Min (Interdisciplinary Program for Bioengineering, Seoul National University Graduate School); Jinwook Choi (Integrated Major in Innovative Medical Science, Seoul National University Graduate School). |
spellingShingle | Hyeryun Park Jiye Son Jeongwon Min Jinwook Choi Selective UMLS knowledge infusion for biomedical question answering Scientific Reports |
title | Selective UMLS knowledge infusion for biomedical question answering |
title_full | Selective UMLS knowledge infusion for biomedical question answering |
title_fullStr | Selective UMLS knowledge infusion for biomedical question answering |
title_full_unstemmed | Selective UMLS knowledge infusion for biomedical question answering |
title_short | Selective UMLS knowledge infusion for biomedical question answering |
title_sort | selective umls knowledge infusion for biomedical question answering |
url | https://doi.org/10.1038/s41598-023-41423-8 |
work_keys_str_mv | AT hyeryunpark selectiveumlsknowledgeinfusionforbiomedicalquestionanswering AT jiyeson selectiveumlsknowledgeinfusionforbiomedicalquestionanswering AT jeongwonmin selectiveumlsknowledgeinfusionforbiomedicalquestionanswering AT jinwookchoi selectiveumlsknowledgeinfusionforbiomedicalquestionanswering |