Light Water Reactor Loading Pattern Optimization with Reinforcement Learning Algorithms
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/157146 https://orcid.org/0000-0002-5940-7695
Summary:

In 2023, commercial Nuclear Power Plants (NPPs) in the USA, comprising Light Water Reactors (LWRs) such as Pressurized Water Reactors (PWRs) and Boiling Water Reactors (BWRs), remained the largest single source of carbon-free energy, providing approximately half of the nation's carbon-free electricity and just under 20% of its total electricity throughout the year. Ensuring the competitiveness of these nuclear assets is crucial for maintaining their role in providing dispatchable clean energy alongside renewable sources. The recent commissioning of Vogtle Units 3 and 4, the first new NPPs connected to the grid in over three decades, highlighted the high costs associated with nuclear technology and underscored the need to improve its economic competitiveness. Optimizing fuel cycle economics through an enhanced core Loading Pattern (LP) is a key strategy for addressing this challenge.

Since the 1960s, optimizing the LP for LWRs has been a major focus in nuclear engineering, but the large search space poses significant difficulties. Computational methods from Stochastic Optimization (SO) have been used to tackle the problem, yet they often fail to outperform the expert-designed solutions preferred by utilities. Deep Reinforcement Learning (RL), which applies Deep Learning to sequential decision-making, has surpassed human-expert solutions in fields such as gaming and robotics. This thesis investigates the use of RL to improve automated tools for solving the PWR LP optimization problem, with the goal of developing efficient decision-support tools that help core designers generate more economical loading patterns.

We present a novel approach using deep RL to solve the LP problem and compare it with traditional SO-based methods. Our findings indicate that the LP problem benefits from a global search to rapidly identify promising directions, followed by a local search to exploit those directions efficiently and avoid local optima. Proximal Policy Optimization (PPO), an RL algorithm, adapts its search capabilities through learnable policy weights, making it effective for both global and local search, which contributes to its superiority over SO-based methods.

Additionally, we introduce a new method called PEARL (Pareto Envelope Augmented with Reinforcement Learning) to tackle multi-objective optimization challenges. PEARL identifies Pareto fronts more efficiently than traditional single-objective scalarization methods, without requiring additional designer intervention. Finally, we extend PEARL to a novel paradigm called physics-informed RL by integrating statistical techniques and physics knowledge to enhance algorithm performance. As problem complexity increases, classical methods sometimes fail to find feasible solutions, and incorporating physics-informed insights becomes crucial for discovering high-quality, diverse solutions more efficiently.

These results highlight the potential of AI advancements in the nuclear field. A deep understanding of AI tools is essential to fully leverage their capabilities: our approach achieved a cumulative benefit of over $4 million per year per plant compared to off-the-shelf AI solutions. While further work is needed to translate these theoretical benefits to real reactors, these algorithms promise to enhance the competitiveness of future nuclear fleets and, in doing so, could contribute substantially to carbon neutrality by increasing the amount of clean electricity on the grid.
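The abstract frames LP design as a sequential decision problem that an agent such as PPO can learn. Below is a minimal, hypothetical sketch of that framing: a toy environment in which an agent loads one assembly type per step into a small core map and receives a surrogate economic score at episode end. The 16-slot core, the three assembly types, and the variance-based score are illustrative assumptions, not the thesis's actual simulator or objective.

```python
# Toy sketch: a PWR loading-pattern problem cast as a sequential
# decision process. All sizes and the scoring rule are illustrative.
import numpy as np


class ToyLoadingPatternEnv:
    """Place one fuel assembly per step until the core map is full."""

    N_POSITIONS = 16       # hypothetical quarter-core with 16 slots
    N_ASSEMBLY_TYPES = 3   # e.g. three enrichment levels (assumed)

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        self.core = -np.ones(self.N_POSITIONS, dtype=int)  # -1 = empty slot
        self.step_idx = 0
        return self.core.copy()

    def step(self, action):
        """Action = which assembly type to load into the next open slot."""
        assert 0 <= action < self.N_ASSEMBLY_TYPES
        self.core[self.step_idx] = action
        self.step_idx += 1
        done = self.step_idx == self.N_POSITIONS
        # Reward only at episode end, from a cheap surrogate of cycle
        # economics; a real study would query a core simulator here.
        reward = self._surrogate_score() if done else 0.0
        return self.core.copy(), reward, done

    def _surrogate_score(self):
        # Toy objective: prefer a balanced mix of assembly types
        # (a crude stand-in for a flat power distribution).
        counts = np.bincount(self.core, minlength=self.N_ASSEMBLY_TYPES)
        return -float(np.var(counts))


# Random-policy rollout: the baseline any RL agent (e.g. PPO) must beat.
env = ToyLoadingPatternEnv()
state = env.reset()
done = False
while not done:
    state, reward, done = env.step(env.rng.integers(env.N_ASSEMBLY_TYPES))
print("episode return:", reward)
```

In a study like the one summarized above, `_surrogate_score` would be replaced by a core physics evaluation and a PPO policy would be trained against episode returns; this snippet only shows the MDP framing.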
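PEARL targets Pareto fronts directly, and the physics-informed extension injects constraint knowledge into the search. The sketch below illustrates those two underlying ideas in isolation: a standard Pareto non-dominance filter over candidate objective vectors, and a hinge penalty for a hypothetical power-peaking limit. The objective columns, the 1.8 limit, and the penalty weight are assumptions for illustration; the thesis's actual PEARL and physics-informed formulations are not reproduced here.

```python
# Sketch of two generic ingredients behind multi-objective and
# physics-informed search; names and numbers are illustrative.
import numpy as np


def pareto_front(objectives):
    """Return indices of non-dominated rows; all objectives are minimized."""
    n = objectives.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        if not keep[i]:
            continue
        # Row j dominates row i if it is <= everywhere and < somewhere.
        dominated = (np.all(objectives <= objectives[i], axis=1)
                     & np.any(objectives < objectives[i], axis=1))
        if dominated.any():
            keep[i] = False
    return np.flatnonzero(keep)


def penalized(cost, peaking_factor, limit=1.8, weight=10.0):
    """Fold a (hypothetical) peaking-factor limit into the score
    as a hinge penalty, one simple form of physics-informed shaping."""
    return cost + weight * max(0.0, peaking_factor - limit)


# Toy candidates: columns = (fuel cost, negative cycle length).
cands = np.array([[1.0, -480.0], [1.2, -500.0], [1.1, -470.0]])
print("Pareto-optimal candidates:", pareto_front(cands))
print("penalized cost:", penalized(1.0, peaking_factor=2.0))
```

In a PEARL-style loop, a dominance filter like this would rank an agent's candidate loading patterns across objectives, while penalty terms are one straightforward way physics constraints can shape rewards when classical methods struggle to find feasible solutions.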