This article explores the connection between strategic thinking in games like chess, the behavior of manipulators (such as drug dealers), and the neurological basis of 'theory of mind', our ability to understand others' perspectives. A recent study suggests a link between forward planning and manipulation: brain activity during negotiation mirrored that seen in chess players.
This article discusses the challenge of aligning artificial agents with human goals and values, highlighting the limitations of current AI alignment approaches such as replicating expert trajectories and reinforcement learning from human feedback (RLHF). It argues that a theory of mind, the ability to understand and evaluate others' beliefs, is essential for achieving true AI alignment.
Researchers tested large language models (LLMs) and humans on a comprehensive battery of theory-of-mind tasks, revealing differences in performance on tasks such as understanding false beliefs, recognizing irony, and identifying faux pas.