Reinforcement online learning to rank with unbiased reward shaping
Crossref DOI link: https://doi.org/10.1007/s10791-022-09413-y
Published Online: 2022-08-04
Published Print: 2022-12
Zhuang, Shengyao
Qiao, Zhihao
Zuccon, Guido http://orcid.org/0000-0003-0271-5563
Funding for this research was provided by:
The University of Queensland
Text and Data Mining valid from 2022-08-04
Version of Record valid from 2022-08-04
Article History
Received: 16 June 2021
Accepted: 30 May 2022
First Online: 4 August 2022