Design and Implementation of Machine Learning-Based Fault Prediction System in Cloud Infrastructure

Bibliographic Details
Main Authors: Hyunsik Yang, Younghan Kim
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Electronics
Subjects: cloud; availability; machine learning; fault detection; anomaly detection
Online Access: https://www.mdpi.com/2079-9292/11/22/3765
author Hyunsik Yang
Younghan Kim
collection DOAJ
description Availability in existing cloud environments is ensured primarily through metric-based fault detection. However, this approach becomes difficult to configure as the cloud grows in size and complexity, and it can only be applied correctly by someone who understands the metrics precisely. Furthermore, additional changes are required whenever the monitoring environment changes. To address these problems, various machine learning-based fault detection and prediction methods have recently been proposed. The model most commonly proposed for cloud environments is a supervised machine learning method that learns from data on fault situations and detects faults based on that data. However, such learning is limited because it is difficult to obtain data for every fault situation that occurs in a large-scale cloud environment. In addition, it is difficult to detect a fault that differs from the learned fault patterns. Furthermore, an automatic recovery architecture that leads from the fault detection results to the fault recovery procedure needs to be discussed. Therefore, in this paper, we designed and implemented a complete system that predicts faults by detecting fault situations using an anomaly detection method.
first_indexed 2024-03-09T18:22:53Z
format Article
id doaj.art-d82d86bcbf1e45dab414318135e7f70d
institution Directory of Open Access Journals
issn 2079-9292
language English
last_indexed 2024-03-09T18:22:53Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-d82d86bcbf1e45dab414318135e7f70d; 2023-11-24T08:10:10Z; eng; MDPI AG; Electronics; 2079-9292; 2022-11-01; 11; 22; 3765; 10.3390/electronics11223765; Design and Implementation of Machine Learning-Based Fault Prediction System in Cloud Infrastructure; Hyunsik Yang (0); Younghan Kim (1); (0) School of Electronic Engineering, Soongsil University, Seoul 06978, Republic of Korea; (1) School of Electronic Engineering, Soongsil University, Seoul 06978, Republic of Korea; https://www.mdpi.com/2079-9292/11/22/3765; cloud; availability; machine learning; fault detection; anomaly detection
title Design and Implementation of Machine Learning-Based Fault Prediction System in Cloud Infrastructure
topic cloud
availability
machine learning
fault detection
anomaly detection
url https://www.mdpi.com/2079-9292/11/22/3765