Application of machine learning methods for filling and updating nuclear knowledge bases

The paper deals with issues of designing and creating knowledge bases in the field of nuclear science and technology. The authors present the results of searching for and testing optimal classification and semantic annotation algorithms applied to the textual network content for the convenience of c...

Full description

Bibliographic Details
Main Authors: Victor P. Telnov, Yury A. Korovin
Format: Article
Language:English
Published: National Research Nuclear University (MEPhI) 2023-06-01
Series:Nuclear Energy and Technology
Subjects:
Online Access:https://nucet.pensoft.net/article/106759/download/pdf/
Description
Summary:The paper deals with issues of designing and creating knowledge bases in the field of nuclear science and technology. The authors present the results of searching for and testing optimal classification and semantic annotation algorithms applied to the textual network content for the convenience of computer-aided filling and updating of scalable semantic repositories (knowledge bases) in the field of nuclear physics and nuclear power engineering and, in the future, for other subject areas, both in Russian and English. The proposed algorithms will provide a methodological and technological basis for creating problem-oriented knowledge bases as artificial intelligence systems, as well as prerequisites for the development of semantic technologies for acquiring new knowledge on the Internet without direct human participation. Testing of the studied machine learning algorithms is carried out by the cross-validation method using corpora of specialized texts. The novelty of the presented study lies in the application of the Pareto optimality principle for multi-criteria evaluation and ranking of the studied algorithms in the absence of a priori information about the comparative significance of the criteria. The project is implemented in accordance with the Semantic Web standards (RDF, OWL, SPARQL, etc.). There are no technological restrictions for integrating the created knowledge bases with third-party data repositories as well as metasearch, library, reference or information and question-answer systems. The proposed software solutions are based on cloud computing using DBaaS and PaaS service models to ensure the scalability of data warehouses and network services. The created software is in the public domain and can be freely replicated.
ISSN:2452-3038