Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
Crossref DOI link: https://doi.org/10.1007/s11229-021-03141-4
Published Online: 2021-05-19
Published Print: 2021-11
Update policy: https://doi.org/10.1007/springer_crossmark_policy
Everitt, Tom https://orcid.org/0000-0003-1210-9866
Hutter, Marcus
Kumar, Ramana
Krakovna, Victoria
Text and Data Mining valid from 2021-05-19
Version of Record valid from 2021-05-19
Article History
Received: 31 March 2018
Accepted: 26 March 2021
First Online: 19 May 2021