This paper explores the cultural evolution of cooperation among LLM agents through a variant of the Donor Game, finding significant differences in cooperative behavior across various base models and initial strategies.
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
How computationally optimized prompts make language models excel, and what this means for prompt engineering.
A detailed analysis of the DeepMind/Meta study: how large language models achieve unprecedented compression rates on text, image, and audio data, and the implications of these results.