Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems
We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows additional entropy regularisers in the objective. We consider a...
Main Authors: | , , |
---|---|
Format: | Journal article |
Language: | English |
Published: |
Society for Industrial and Applied Mathematics
2024
|