Learning reward functions from diverse sources of human feedback: Optimally integrating demonstrations and preferences
Crossref DOI link: https://doi.org/10.1177/02783649211041652
Published Online: 2021-08-28
Published Print: 2022-01
Update policy: https://doi.org/10.1177/sage-journals-update-policy
Bıyık, Erdem http://orcid.org/0000-0002-9516-3130
Losey, Dylan P.
Palan, Malayandi
Landolfi, Nicholas C.
Shevchuk, Gleb
Sadigh, Dorsa
Funding for this research was provided by:
FLI (RFP2-000)
Toyota Research Institute
NSF (#1849952)
Text and Data Mining valid from 2021-08-28