Assessment of the readiness of a computer system for timely servicing of requests when combined with information recovery of memory after failures
The possibilities of increasing the readiness of a redundant computer system for the timely execution of requests critical to service delays are being investigated. A fault-tolerant computer cluster is considered in which nodes are duplicated computing systems that combine computer nodes and memor...
Main Authors: | Vladimir A. Bogatyrev, Stanislav V. Bogatyrev, Anatoly V. Bogatyrev |
---|---|
Format: | Article |
Language: | English |
Published: | Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University), 2023-06-01 |
Series: | Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki |
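Subjects: | cluster; availability factor; recovery; information recovery of memory; Markov model; recovery discipline; criticality to service delays; probability of timely execution of requests; duplicated system; fault tolerance |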
Online Access: | https://ntv.ifmo.ru/file/article/22070.pdf |
_version_ | 1797798445976649728 |
---|---|
author | Vladimir A. Bogatyrev; Stanislav V. Bogatyrev; Anatoly V. Bogatyrev |
author_facet | Vladimir A. Bogatyrev; Stanislav V. Bogatyrev; Anatoly V. Bogatyrev |
author_sort | Vladimir A. Bogatyrev |
collection | DOAJ |
description | This paper investigates ways to increase the readiness of a redundant computer system to execute delay-critical
requests in a timely manner. A fault-tolerant computer cluster is considered whose nodes are duplicated computing
systems that combine computing nodes and memory nodes. Memory nodes are assumed to recover in two stages:
first physically, then informationally, with the information recovery carried out using the resources of the computing
nodes. The novelty of the approach is that, for systems with a limit on the allowable service time of functional
requests, the impact of recovery disciplines on system readiness is evaluated for various options of dividing computing
resources between restoring information after memory failures and performing the required functions. The reliability
of the systems under study is assessed not only by the probability that they are ready to perform functional tasks
(the readiness coefficient, or availability factor), but also by the probability that they are ready to perform those
tasks in a timely manner. The choice of recovery disciplines and of disciplines for servicing the flow of functional
requests is justified on the basis of Markov models. The proposed models account for the impact of dividing computing
resources between performing the required functions and the information recovery of memory that follows its physical
recovery. The choice of service disciplines based on the proposed Markov model aims at a compromise between
increasing the availability factor and increasing the probability of timely execution of the incoming flow of functional
requests. The choice of options for distributing the computing resources that remain after failures between functional
requests (the required functions) and the information recovery of memory is justified. Based on the proposed Markov
models, the dependence of the system's readiness for timely execution of requests on these distribution options is
investigated as a function of the allowable waiting time of functional requests and of their traffic intensity. The
influence of balancing the traffic of functional tasks between computing nodes on the system's readiness for timely
execution of requests is analyzed, taking into account the options for using those nodes jointly for the information
recovery of memory nodes after their physical recovery. It is shown that an optimal share of traffic distribution
between computing nodes exists, given the options for dividing their resources between servicing functional requests
and restoring information in memory nodes after their physical recovery. The results can be used to justify the choice
of disciplines for servicing functional requests and for recovery after failures in fault-tolerant cluster systems that
are critical to delays in the execution of functional requests. |
first_indexed | 2024-03-13T04:03:49Z |
format | Article |
id | doaj.art-0a6eca79dec846a5aba13f5cef9e678f |
institution | Directory of Open Access Journals |
issn | 2226-1494; 2500-0373 |
language | English |
last_indexed | 2024-03-13T04:03:49Z |
publishDate | 2023-06-01 |
publisher | Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University) |
record_format | Article |
series | Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki |
spelling | doaj.art-0a6eca79dec846a5aba13f5cef9e678f; 2023-06-21T09:42:25Z; eng; Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki, vol. 23, no. 3, pp. 608-617 (2023-06-01); DOI: 10.17586/2226-1494-2023-23-3-608-617. Authors: Vladimir A. Bogatyrev, https://orcid.org/0000-0003-0213-0223 (D.Sc., Professor, ITMO University, Saint Petersburg, 197101, Russian Federation; Professor, Saint Petersburg State University of Aerospace Instrumentation, 190000, Russian Federation; sc 7006571069); Stanislav V. Bogatyrev, https://orcid.org/0000-0003-0836-8515 (PhD Student, ITMO University, Saint Petersburg, 197101, Russian Federation; Consulting Engineer, Yadro Cloud Storage Development Center, Saint Petersburg, 195027, Russian Federation; sc 57183002200); Anatoly V. Bogatyrev, https://orcid.org/0000-0001-5447-7275 (PhD, Consulting Engineer, Yadro Cloud Storage Development Center, Saint Petersburg, 195027, Russian Federation; sc 56549712700) |
title | Assessment of the readiness of a computer system for timely servicing of requests when combined with information recovery of memory after failures |
topic | cluster; availability factor; recovery; information recovery of memory; Markov model; recovery discipline; criticality to service delays; probability of timely execution of requests; duplicated system; fault tolerance |
url | https://ntv.ifmo.ru/file/article/22070.pdf |
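
The abstract in this record distinguishes two reliability measures, the availability factor and the probability of timely execution of requests, and reports that an optimal share of traffic between computing nodes exists when one node also spends resources on information recovery. The Python sketch below only illustrates those ideas on a toy model and is not the authors' Markov model. It rests on assumptions that do not come from the record: a memory node cycles through three exponentially distributed stages (working, physical recovery, information recovery), each computing node behaves as an independent M/M/1 queue, and the node that helps with information recovery loses a fixed fraction `recovery_share` of its service rate. All function names, parameters, and numeric values are hypothetical.

```python
import math


def stationary_probabilities(fail_rate, phys_rate, info_rate):
    """Stationary probabilities of an assumed cyclic chain:
    working -> physical recovery -> information recovery -> working.
    In a cyclic chain each probability is proportional to the mean
    time spent in that state (1 / exit rate)."""
    t_up = 1.0 / fail_rate
    t_phys = 1.0 / phys_rate
    t_info = 1.0 / info_rate
    total = t_up + t_phys + t_info
    return t_up / total, t_phys / total, t_info / total


def timely_probability(arrival_rate, service_rate, max_wait):
    """P(queueing delay <= max_wait) for an M/M/1 FCFS queue.
    Returns 0.0 when the queue is overloaded (no steady state)."""
    if arrival_rate >= service_rate:
        return 0.0
    rho = arrival_rate / service_rate
    return 1.0 - rho * math.exp(-(service_rate - arrival_rate) * max_wait)


def best_traffic_share(total_rate, mu, recovery_share, max_wait, steps=1000):
    """Grid search for the share g of traffic sent to the fully available
    node; the other node keeps only (1 - recovery_share) of its rate
    because it also performs information recovery (an assumption)."""
    best_g, best_p = 0.0, -1.0
    for i in range(steps + 1):
        g = i / steps
        p_full = timely_probability(g * total_rate, mu, max_wait)
        p_busy = timely_probability((1.0 - g) * total_rate,
                                    (1.0 - recovery_share) * mu, max_wait)
        # Probability that a randomly chosen request is served in time.
        p = g * p_full + (1.0 - g) * p_busy
        if p > best_p:
            best_g, best_p = g, p
    return best_g, best_p


if __name__ == "__main__":
    # Illustrative numbers only; none of them come from the article.
    up, phys, info = stationary_probabilities(0.001, 0.1, 0.05)
    print("availability factor of the memory node:", up)
    g, p = best_traffic_share(total_rate=1.0, mu=2.0,
                              recovery_share=0.5, max_wait=1.0)
    print("best traffic share:", g, "timely probability:", p)
```

In this toy setting the best share lies strictly between an even split and sending all traffic to the fully available node, which is the kind of interior optimum the abstract describes. The article's own models may differ in state space, service discipline, and how resources are divided.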