Multi-agent robust policy evaluation for reinforcement learning via primal-dual online time-averaging
Crossref DOI link: https://doi.org/10.1007/s11432-024-4578-2
Published Online: 2025-10-23
Published Print: 2025-12
Authors: Chen, Gang; Pu, Changli; Zhou, Yaoyao; Li, Xiumin; Chen, Huimiao
Text and Data Mining valid from 2025-10-23
Version of Record valid from 2025-10-23
Article History
Received: 16 May 2024
Revised: 26 August 2024
Accepted: 14 July 2025
First Online: 23 October 2025