How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation

Yang Xiao, Yi Cheng, 3 Authors, Pengfei Liu

2023 · DOI: 10.48550/arXiv.2312.17115
arXiv.org · 8 Citations

TLDR

The paper introduces two metrics for assessing the believability of LLM-based agents, consistency and robustness, together with a benchmark, SimulateBench, which is used to evaluate the consistency and robustness of agents implemented with popular LLMs.