Optimal Multi-impulse Linear Rendezvous via Reinforcement Learning

A reinforcement learning-based approach is proposed to design multi-impulse rendezvous trajectories under linear relative motion. For relative motion in elliptical orbits, the relative state is propagated directly via the state transition matrix. The rendezvous problem is constructed as a Markov decision process that reflects the fuel consumption, the transfer time, the relative state, and the dynamical model. An actor-critic algorithm is used to train a policy that generates rendezvous maneuvers, and results from numerical optimization (e.g., differential evolution) are adopted as an expert data set to accelerate training. By deploying the trained policy network, multi-impulse rendezvous trajectories can be obtained on board. The approach also generates feasible solutions for many impulses (e.g., 20 impulses), which can serve as initial values for further optimization. Numerical examples with random initial states show that the proposed method is much faster than the evolutionary algorithm, at the cost of slightly worse performance indexes.
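As the abstract notes, the relative state under linear dynamics propagates directly through the state transition matrix (STM), so an impulse sequence can be evaluated without numerical integration. The sketch below illustrates this with the circular-orbit Clohessy-Wiltshire STM, a simpler special case standing in for the elliptical-orbit STM the paper uses; the function names and the closed-form two-impulse solve are illustrative assumptions, not the authors' code.

```python
import numpy as np

def cw_stm(n: float, t: float) -> np.ndarray:
    """6x6 Clohessy-Wiltshire state transition matrix.

    State ordering: [x, y, z, vx, vy, vz]; x radial, y along-track,
    z cross-track; n is the chief's mean motion (rad/s).
    """
    s, c = np.sin(n * t), np.cos(n * t)
    return np.array([
        [4 - 3*c,       0, 0,  s/n,          2*(1 - c)/n,     0],
        [6*(s - n*t),   1, 0, -2*(1 - c)/n, (4*s - 3*n*t)/n,  0],
        [0,             0, c,  0,            0,               s/n],
        [3*n*s,         0, 0,  c,            2*s,             0],
        [6*n*(c - 1),   0, 0, -2*s,          4*c - 3,         0],
        [0,             0, -n*s, 0,          0,               c],
    ])

def propagate_with_impulses(x0, n, impulses):
    """Propagate a relative state through a sequence of (coast_time, delta-v) pairs.

    Each impulse is applied instantaneously after its coast arc.
    Returns the final 6-vector state and the total delta-v magnitude.
    """
    x = np.asarray(x0, dtype=float)
    total_dv = 0.0
    for dt, dv in impulses:
        x = cw_stm(n, dt) @ x   # coast under linear dynamics
        x[3:] += dv             # instantaneous velocity change
        total_dv += np.linalg.norm(dv)
    return x, total_dv

def two_impulse_rendezvous(x0, n, tf):
    """Closed-form two-impulse solution driving x0 to the origin in time tf."""
    phi = cw_stm(n, tf)
    Prr, Prv = phi[:3, :3], phi[:3, 3:]
    # First impulse: pick a post-burn velocity that zeroes the coasted position.
    v_req = -np.linalg.solve(Prv, Prr @ x0[:3])
    dv1 = v_req - np.asarray(x0, float)[3:]
    # Coast, then the second impulse cancels the arrival velocity.
    xf = phi @ np.concatenate([x0[:3], v_req])
    dv2 = -xf[3:]
    return dv1, dv2
```

In this linear setting the two-impulse solution is exact, which is why such trajectories are natural seeds (or baselines) for multi-impulse optimization.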

Bibliographic Details
Main Authors: Longwei Xu, Gang Zhang, Shi Qiu, Xibin Cao
Format: Article
Language: English
Published: American Association for the Advancement of Science (AAAS), 2023-01-01
Series: Space: Science & Technology
Online Access: https://spj.science.org/doi/10.34133/space.0047
collection DOAJ
description A reinforcement learning-based approach is proposed to design multi-impulse rendezvous trajectories under linear relative motion. For relative motion in elliptical orbits, the relative state is propagated directly via the state transition matrix. The rendezvous problem is constructed as a Markov decision process that reflects the fuel consumption, the transfer time, the relative state, and the dynamical model. An actor-critic algorithm is used to train a policy that generates rendezvous maneuvers, and results from numerical optimization (e.g., differential evolution) are adopted as an expert data set to accelerate training. By deploying the trained policy network, multi-impulse rendezvous trajectories can be obtained on board. The approach also generates feasible solutions for many impulses (e.g., 20 impulses), which can serve as initial values for further optimization. Numerical examples with random initial states show that the proposed method is much faster than the evolutionary algorithm, at the cost of slightly worse performance indexes.
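The Markov decision process described above can be sketched as a small episodic environment: each action is a 3-vector impulse, and the reward penalizes fuel, elapsed time, and the terminal relative state. The class name, reward weights, fixed coast interval, and the circular-orbit Clohessy-Wiltshire dynamics standing in for the paper's elliptical-orbit model are all illustrative assumptions.

```python
import numpy as np

class RendezvousMDP:
    """Minimal MDP sketch: state = 6-vector relative state, action = impulse."""

    def __init__(self, n=0.00113, dt=300.0, max_steps=10,
                 w_fuel=1.0, w_time=0.01, w_state=1e-3):
        self.n, self.dt, self.max_steps = n, dt, max_steps
        self.w_fuel, self.w_time, self.w_state = w_fuel, w_time, w_state

    def _stm(self, t):
        # Circular-orbit Clohessy-Wiltshire state transition matrix.
        n = self.n
        s, c = np.sin(n * t), np.cos(n * t)
        return np.array([
            [4 - 3*c,     0, 0,  s/n,          2*(1 - c)/n,     0],
            [6*(s - n*t), 1, 0, -2*(1 - c)/n, (4*s - 3*n*t)/n,  0],
            [0,           0, c,  0,            0,               s/n],
            [3*n*s,       0, 0,  c,            2*s,             0],
            [6*n*(c - 1), 0, 0, -2*s,          4*c - 3,         0],
            [0,           0, -n*s, 0,          0,               c],
        ])

    def reset(self, x0):
        self.x = np.asarray(x0, dtype=float)
        self.k = 0
        return self.x.copy()

    def step(self, dv):
        """Apply a 3-vector impulse, coast one interval; return (s', r, done)."""
        self.x[3:] += dv
        self.x = self._stm(self.dt) @ self.x
        self.k += 1
        done = self.k >= self.max_steps
        # Reward penalizes fuel and elapsed time each step, and the
        # remaining relative state (miss) at episode end.
        r = -self.w_fuel * np.linalg.norm(dv) - self.w_time * self.dt
        if done:
            r -= self.w_state * np.linalg.norm(self.x)
        return self.x.copy(), r, done
```

An actor-critic learner would interact with such an environment, with the expert transitions from differential evolution mixed in to warm-start the policy, as the abstract describes.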
id doaj.art-8112840f4ed74c7ba4b5bf672c949b5d
institution Directory Open Access Journal
issn 2692-7659
affiliation Research Center of Satellite Technology, Harbin Institute of Technology, Harbin 150001, PR China (all four authors)