Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling
Crossref DOI link: https://doi.org/10.1007/s10994-020-05912-5
Published Online: 2021-01-04
Published Print: 2021-03
Prashanth, L. A. (ORCID: https://orcid.org/0000-0003-0362-6730)
Korda, Nathaniel
Munos, Rémi
Article History
Received: 4 August 2014
Revised: 28 January 2020
Accepted: 8 September 2020
First Online: 4 January 2021
Free to read: This content has been made available to all.