Memory‐based deep reinforcement learning for cognitive radar target tracking waveform resource management

Abstract A cognitive radar (CR) system can offer enhanced target tracking performance through its perception‐action cycle, in which the CR adaptively allocates its limited transmit resources based on its perception of the surrounding environment. To effectively manage the transmit w...


Bibliographic Details
Main Authors:	Jiahao Qin, Mengtao Zhu, Zesi Pan, Yunjie Li, Yan Li
Format:	Article
Language:	English
Published:	Wiley 2023-12-01
Series:	IET Radar, Sonar & Navigation
Subjects:	adaptive radar, decision making, intelligent networks
Online Access:	https://doi.org/10.1049/rsn2.12469

author	Jiahao Qin, Mengtao Zhu, Zesi Pan, Yunjie Li, Yan Li
author_sort	Jiahao Qin
collection	DOAJ
description	Abstract A cognitive radar (CR) system can offer enhanced target tracking performance through its perception‐action cycle, in which the CR adaptively allocates its limited transmit resources based on its perception of the surrounding environment. To manage the transmit waveform resource effectively for the target tracking task, the CR resource management problem is formulated under the partially observable Markov decision process (POMDP) framework, which accounts for both the sequential decision‐making and the inherent partial observability of the target tracking problem. A long short‐term memory (LSTM)‐based twin delayed deep deterministic policy gradient (TD3) algorithm is developed to solve the problem. A reward function is designed around Haykin's cognitive executive attention mechanism for radar systems so that the CR resource management policy makes stable transmit waveform decisions, following the principle of minimum disturbance. Simulation results show that the proposed LSTM memory‐based TD3 improves target tracking performance and increases mean rewards for the CR.
first_indexed	2024-03-09T00:23:28Z
format	Article
id	doaj.art-94948ff6faf8416595d1712b860ba445
institution	Directory Open Access Journal
issn	1751-8784, 1751-8792
language	English
last_indexed	2024-03-09T00:23:28Z
publishDate	2023-12-01
publisher	Wiley
record_format	Article
series	IET Radar, Sonar & Navigation
spelling	doaj.art-94948ff6faf8416595d1712b860ba445; 2023-12-12T05:20:22Z; eng; Wiley; IET Radar, Sonar & Navigation; ISSN 1751-8784, 1751-8792; 2023-12-01; vol. 17, no. 12, pp. 1822-1836; DOI 10.1049/rsn2.12469; Memory‐based deep reinforcement learning for cognitive radar target tracking waveform resource management; Jiahao Qin (School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China); Mengtao Zhu (School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China); Zesi Pan (School of Information and Electronics, Beijing Institute of Technology, Beijing, China); Yunjie Li (Laboratory of Electromagnetic Space Cognition and Intelligent Control, Beijing, China); Yan Li (School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China); https://doi.org/10.1049/rsn2.12469; adaptive radar, decision making, intelligent networks
title	Memory‐based deep reinforcement learning for cognitive radar target tracking waveform resource management
topic	adaptive radar, decision making, intelligent networks
url	https://doi.org/10.1049/rsn2.12469


Similar Items