Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds)

This article proposes an algorithm for solving the problem of extracting information from biomedical patents and scientific publications. The introduced algorithm is based on machine learning methods. Experiments were carried out on patents from the USPTO database. Experiments have shown that the best extraction quality was achieved by a model based on BioBERT.

Bibliographic Details
Main Authors: Nikolay A. Kolpakov, Alexey I. Molodchenkov, Anton V. Lukin
Format: Article
Language: English
Published: Peoples’ Friendship University of Russia (RUDN University) 2023-03-01
Series: Discrete and Continuous Models and Applied Computational Science
Subjects: machine learning; natural language processing; named entity recognition; biomedical texts processing
Online Access: https://journals.rudn.ru/miph/article/viewFile/34463/22068
author Nikolay A. Kolpakov
Alexey I. Molodchenkov
Anton V. Lukin
author_sort Nikolay A. Kolpakov
collection DOAJ
description This article proposes an algorithm for solving the problem of extracting information from biomedical patents and scientific publications. The introduced algorithm is based on machine learning methods. Experiments were carried out on patents from the USPTO database. Experiments have shown that the best extraction quality was achieved by a model based on BioBERT.
first_indexed 2024-04-09T16:12:23Z
format Article
id doaj.art-766bba0920904416b3783f939fa3fe6c
institution Directory Open Access Journal
issn 2658-4670
2658-7149
language English
last_indexed 2024-04-09T16:12:23Z
publishDate 2023-03-01
publisher Peoples’ Friendship University of Russia (RUDN University)
record_format Article
series Discrete and Continuous Models and Applied Computational Science
spelling doaj.art-766bba0920904416b3783f939fa3fe6c
2023-04-24T09:00:21Z
eng
Peoples’ Friendship University of Russia (RUDN University)
Discrete and Continuous Models and Applied Computational Science
2658-4670
2658-7149
2023-03-01
31
1
64
74
10.22363/2658-4670-2023-31-1-64-74
21010
Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds)
Nikolay A. Kolpakov (0) https://orcid.org/0000-0002-1640-1357
Alexey I. Molodchenkov (1) https://orcid.org/0000-0003-0039-943X
Anton V. Lukin (2) https://orcid.org/0000-0003-4391-1958
Moscow Institute of Physics and Technology (MIPT)
Federal research center “Computer science and control” of RAS
Federal research center “Computer science and control” of RAS
This article proposes an algorithm for solving the problem of extracting information from biomedical patents and scientific publications. The introduced algorithm is based on machine learning methods. Experiments were carried out on patents from the USPTO database. Experiments have shown that the best extraction quality was achieved by a model based on BioBERT.
https://journals.rudn.ru/miph/article/viewFile/34463/22068
machine learning
natural language processing
named entity recognition
biomedical texts processing
title Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds)
topic machine learning
natural language processing
named entity recognition
biomedical texts processing
url https://journals.rudn.ru/miph/article/viewFile/34463/22068