Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds)
This article proposes an algorithm for solving the problem of extracting information from biomedical patents and scientific publications. The introduced algorithm is based on machine learning methods. Experiments were carried out on patents from the USPTO database. Experiments have shown that the best extraction quality was achieved by a model based on BioBERT.
Main Authors: | Nikolay A. Kolpakov, Alexey I. Molodchenkov, Anton V. Lukin |
---|---|
Format: | Article |
Language: | English |
Published: | Peoples’ Friendship University of Russia (RUDN University), 2023-03-01 |
Series: | Discrete and Continuous Models and Applied Computational Science |
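Subjects: | machine learning; natural language processing; named entity recognition; biomedical texts processing |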
Online Access: | https://journals.rudn.ru/miph/article/viewFile/34463/22068 |
_version_ | 1827960654533754880 |
---|---|
author | Nikolay A. Kolpakov; Alexey I. Molodchenkov; Anton V. Lukin |
author_facet | Nikolay A. Kolpakov; Alexey I. Molodchenkov; Anton V. Lukin |
author_sort | Nikolay A. Kolpakov |
collection | DOAJ |
description | This article proposes an algorithm for solving the problem of extracting information from biomedical patents and scientific publications. The introduced algorithm is based on machine learning methods. Experiments were carried out on patents from the USPTO database. Experiments have shown that the best extraction quality was achieved by a model based on BioBERT. |
first_indexed | 2024-04-09T16:12:23Z |
format | Article |
id | doaj.art-766bba0920904416b3783f939fa3fe6c |
institution | Directory Open Access Journal |
issn | 2658-4670 2658-7149 |
language | English |
last_indexed | 2024-04-09T16:12:23Z |
publishDate | 2023-03-01 |
publisher | Peoples’ Friendship University of Russia (RUDN University) |
record_format | Article |
series | Discrete and Continuous Models and Applied Computational Science |
spelling | doaj.art-766bba0920904416b3783f939fa3fe6c; 2023-04-24T09:00:21Z; eng; Peoples’ Friendship University of Russia (RUDN University); Discrete and Continuous Models and Applied Computational Science; 2658-4670; 2658-7149; 2023-03-01; 31; 1; 64; 74; 10.22363/2658-4670-2023-31-1-64-74; 21010; Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds); Nikolay A. Kolpakov [0] https://orcid.org/0000-0002-1640-1357; Alexey I. Molodchenkov [1] https://orcid.org/0000-0003-0039-943X; Anton V. Lukin [2] https://orcid.org/0000-0003-4391-1958; Moscow Institute of Physics and Technology (MIPT); Federal research center “Computer science and control” of RAS; Federal research center “Computer science and control” of RAS; This article proposes an algorithm for solving the problem of extracting information from biomedical patents and scientific publications. The introduced algorithm is based on machine learning methods. Experiments were carried out on patents from the USPTO database. Experiments have shown that the best extraction quality was achieved by a model based on BioBERT.; https://journals.rudn.ru/miph/article/viewFile/34463/22068; machine learning; natural language processing; named entity recognition; biomedical texts processing |
title | Methods of extracting biomedical information from patents and scientific publications (on the example of chemical compounds) |
title_sort | methods of extracting biomedical information from patents and scientific publications on the example of chemical compounds |
topic | machine learning; natural language processing; named entity recognition; biomedical texts processing |
url | https://journals.rudn.ru/miph/article/viewFile/34463/22068 |