
Evaluating large language models in theory of mind tasks

Michal Kosinski

2023 · DOI: 10.1073/pnas.2405460121
Proceedings of the National Academy of Sciences of the United States of America · 213 citations

TLDR

The results show that recent large language models can solve false-belief tasks, typically used to evaluate ToM in humans, and signify the advent of more powerful and socially skilled AI—with profound positive and negative implications.

Abstract

Significance: Humans automatically and effortlessly track others’ unobservable mental states, such as their knowledge, intentions, beliefs, and desires. This ability, typically called “theory of mind” (ToM), is fundamental to human social interactions, communication, empathy, consciousness, moral judgment, and religious beliefs. Our results show that recent large language models (LLMs) can solve false-belief tasks, typically used to evaluate ToM in humans. Regardless of how we interpret these outcomes, they signify the advent of more powerful and socially skilled AI, with profound positive and negative implications.