Variance-constrained actor-critic algorithms for discounted and average reward MDPs
Crossref DOI link: https://doi.org/10.1007/s10994-016-5569-5
Published Online: 2016-08-05
Published Print: 2016-12
Update policy: https://doi.org/10.1007/springer_crossmark_policy
Authors: Prashanth, L. A.; Ghavamzadeh, Mohammad
License valid from 2016-08-05