Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning
Tianbao Xie, Siheng Zhao, 5 authors, Tao Yu
2023 · DOI: 10.48550/arXiv.2309.11489
arXiv.org · 83 citations
TLDR
Text2Reward is a data-free framework that uses large language models (LLMs) to automate the generation of dense reward functions. It produces interpretable, free-form dense reward code that covers a wide range of tasks, utilizes existing packages, and allows iterative refinement with human feedback.
