Thermal management for large-scale multi-core processors

Bibliographic Details
Main Author: Cui, Yingnan
Other Authors: Tang Xueyan
Format: Thesis
Language: English
Published: 2016
Online Access: http://hdl.handle.net/10356/69037
Description
Summary: Thermal management is a critical research topic for modern multi-core processors. The steady shrinking of feature sizes has resulted in high power density in semiconductor devices, which severely increases their internal temperature during operation. To prevent multi-core processors from malfunctioning or being damaged by high operating temperatures, effective and efficient thermal management technologies are required. Traditional thermal management technologies can be divided into two major categories: scheduling-based solutions and dynamic voltage and frequency scaling (DVFS) based solutions. As semiconductor technology enters the deep sub-micron era, both kinds of solutions face new challenges. First, as the scale of multi-core processors keeps increasing, traditional centralized thermal management technologies show limited scalability and efficiency on large-scale multi-core processor systems. Second, process variation has become so severe that the resulting variation in thermal and power models challenges the effectiveness of model-based thermal management solutions. Finally, because microprocessors are digital systems, frequency and voltage levels are scaled in discrete steps. However, most thermal management solutions are based on continuous thermal models, which can degrade control quality and increase the overhead of thermal management techniques. This thesis addresses the above challenges as follows. First, we propose a decentralized thermal-aware scheduling algorithm for large-scale multi-core processors. The algorithm divides the multi-core processor into individual clusters and uses a hierarchy of software agents to manage the mapping and scheduling of tasks in each cluster in a decentralized manner.
The algorithm significantly improves scalability while delivering satisfactory thermal management quality compared with state-of-the-art centralized thermal-aware scheduling solutions. Second, we propose a variation-aware self-adaptive fuzzy thermal controller that dynamically adjusts the voltage and frequency levels of multi-core processors to keep processor temperature under a given threshold. The self-adaptive fuzzy controller does not rely on accurate thermal models of the processors, which are often impossible to build. In addition, it is able to offset the disturbance caused by process variation. Third, we propose a discrete thermal controller design to bridge the gap between the behavior of digital circuits and the continuous process of temperature change. In this work, we analyse the continuous system response of a discrete circuit and use discrete controller design methodology to derive a discrete thermal controller. By avoiding signal distortion, the discrete controller outperforms previous designs at a much lower sampling frequency, which significantly reduces system overhead. Overall, this thesis contributes to addressing the challenges in thermal management, and we demonstrate significant improvements in effectiveness and efficiency over state-of-the-art studies in thermal management.
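
The mismatch between continuous control and discrete DVFS levels mentioned in the summary can be illustrated with a minimal Python sketch. It is not the controller proposed in the thesis: the frequency levels, the proportional gain, and the function names `proportional_request` and `quantize_down` are all hypothetical, chosen only to show that a continuous-valued frequency request has to be snapped to one of a few available levels.

```python
from bisect import bisect_right

# Hypothetical discrete DVFS frequency levels in GHz (illustrative values only).
FREQ_LEVELS_GHZ = [0.8, 1.2, 1.6, 2.0, 2.4]


def proportional_request(temp_c, threshold_c, current_ghz, gain_ghz_per_c=0.05):
    """Continuous-valued frequency request from a simple proportional rule.

    The request drops when the temperature is above the threshold and rises
    when it is below. The gain is an assumed value, not a tuned one.
    """
    return current_ghz + gain_ghz_per_c * (threshold_c - temp_c)


def quantize_down(requested_ghz, levels=FREQ_LEVELS_GHZ):
    """Snap a continuous request to the highest available level not above it.

    Falls back to the lowest level when the request is below every level.
    `levels` must be sorted in ascending order.
    """
    idx = bisect_right(levels, requested_ghz) - 1
    return levels[max(idx, 0)]


if __name__ == "__main__":
    # Core running at 2.0 GHz, 5 degrees C above an 80 degrees C threshold.
    request = proportional_request(temp_c=85.0, threshold_c=80.0, current_ghz=2.0)
    applied = quantize_down(request)
    print(f"requested {request:.2f} GHz, applied {applied:.1f} GHz")
```

The gap between the requested and the applied frequency is the quantization error that a controller designed purely on a continuous model does not account for.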