Reliability-Aware Resource Management in Multi-/Many-Core Systems: A Perspective Paper

With the advancement of technology scaling, multi/many-core platforms are getting more attention in embedded systems due to the ever-increasing performance requirements and power efficiency. This feature size scaling, along with architectural innovations, has dramatically exacerbated the rate of man...

Bibliographic Details
Main Authors: Siva Satyendra Sahoo, Behnaz Ranjbar, Akash Kumar
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Journal of Low Power Electronics and Applications
Subjects: multi/many-core platforms; reliability; resource management; mixed-criticality
Online Access: https://www.mdpi.com/2079-9268/11/1/7
author Siva Satyendra Sahoo
Behnaz Ranjbar
Akash Kumar
collection DOAJ
description With the advancement of technology scaling, multi/many-core platforms are getting more attention in embedded systems due to the ever-increasing performance requirements and power efficiency. This feature size scaling, along with architectural innovations, has dramatically exacerbated the rate of manufacturing defects and physical fault rates. As a result, in addition to providing high parallelism, such hardware platforms have introduced increasing unreliability into the system. Such systems need to be well designed to ensure long-term and application-specific reliability, especially in mixed-criticality systems, where incorrect execution of applications may cause catastrophic consequences. However, the optimal allocation of applications/tasks on multi/many-core platforms is an increasingly complex problem. Therefore, reliability-aware resource management is crucial while ensuring the application-specific Quality-of-Service (QoS) requirements and optimizing other system-level performance goals. This article presents a survey of recent works that focus on reliability-aware resource management in multi-/many-core systems. We first present an overview of reliability in electronic systems, associated fault models, and the various system models used in related research. Then, we present recently published articles primarily focusing on aspects such as application-specific reliability optimization, mixed-criticality awareness, and hardware resource heterogeneity. To underscore the techniques’ differences, we classify them based on the design space exploration. In the end, we briefly discuss the upcoming trends and open challenges within the domain of reliability-aware resource management for future research.
first_indexed 2024-03-09T03:46:11Z
format Article
id doaj.art-257ccb61b71042ebaec4532c87df86f3
institution Directory Open Access Journal
issn 2079-9268
language English
last_indexed 2024-03-09T03:46:11Z
publishDate 2021-01-01
publisher MDPI AG
record_format Article
series Journal of Low Power Electronics and Applications
spelling doaj.art-257ccb61b71042ebaec4532c87df86f3
2023-12-03T14:34:20Z
eng
MDPI AG
Journal of Low Power Electronics and Applications
2079-9268
2021-01-01
11 1 7
10.3390/jlpea11010007
Reliability-Aware Resource Management in Multi-/Many-Core Systems: A Perspective Paper
Siva Satyendra Sahoo
Behnaz Ranjbar
Akash Kumar
CFAED, Technische Universität Dresden, 01062 Dresden, Germany
https://www.mdpi.com/2079-9268/11/1/7
multi/many-core platforms
reliability
resource management
mixed-criticality
title Reliability-Aware Resource Management in Multi-/Many-Core Systems: A Perspective Paper
topic multi/many-core platforms
reliability
resource management
mixed-criticality
url https://www.mdpi.com/2079-9268/11/1/7