Google DeepMind has introduced AlphaEvolve, an LLM-powered evolutionary coding agent that automates the design of algorithms for Multi-Agent Reinforcement Learning (MARL) in imperfect-information games. Using Gemini 2.5 Pro to mutate Python source code, the system discovered two novel algorithms: VAD-CFR and SHOR-PSRO. These evolved algorithms matched or surpassed state-of-the-art hand-designed baselines across a range of scenarios, including poker and Liar's Dice. The research highlights the ability of automated search to discover non-intuitive mechanisms, such as volatility-adaptive discounting and hybrid meta-solvers, which generalize effectively to larger, unseen games, showing that LLMs can evolve complex algorithmic logic more efficiently than manual human iteration.
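A minimal sketch of the kind of evolutionary loop such a coding agent runs. The `llm_mutate` and `evaluate` callables here are placeholders (not the actual AlphaEvolve API): the first stands in for an LLM proposing a code rewrite, the second for a game-specific fitness measure such as exploitability or win rate against baseline agents.

```python
import random

def evolve_algorithm(seed_program: str, llm_mutate, evaluate,
                     population_size: int = 20, generations: int = 50):
    """Toy LLM-driven evolutionary search over program source code.

    llm_mutate(source) -> new source proposed by an LLM (placeholder).
    evaluate(source)   -> scalar fitness of the candidate (placeholder).
    """
    population = [(seed_program, evaluate(seed_program))]
    for _ in range(generations):
        # Tournament selection: pick the fittest of a small random sample.
        parent, _ = max(random.sample(population, k=min(3, len(population))),
                        key=lambda p: p[1])
        child = llm_mutate(parent)                 # LLM rewrites part of the code
        population.append((child, evaluate(child)))
        # Truncation selection: keep only the best candidates.
        population = sorted(population, key=lambda p: p[1],
                            reverse=True)[:population_size]
    return population[0]  # best (source, fitness) pair found
```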
>As autonomous agents powered by LLMs are increasingly deployed in society, understanding their collective behaviour in social dilemmas becomes critical. We introduce an evaluation framework where LLMs generate strategies encoded as algorithms, enabling inspection prior to deployment and scaling to populations of hundreds of agents—substantially larger than in previous work. We find that more recent models tend to produce worse societal outcomes compared to older models when agents prioritise individual gain over collective benefits. Using cultural evolution to model user selection of agents, our simulations reveal a significant risk of convergence to poor societal equilibria, particularly when the relative benefit of cooperation diminishes and population sizes increase. We release our code as an evaluation suite for developers to assess the emergent collective behaviour of their models.
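A toy sketch of the cultural-evolution dynamic the abstract describes: a population of strategy functions plays a simple donation game, and agents occasionally imitate the best-performing agent. The payoff structure and imitation rule are illustrative assumptions, not the paper's exact protocol; in the paper the strategies themselves would be algorithms generated by LLMs.

```python
import random

def run_cultural_evolution(strategies, cooperation_benefit=2.0,
                           rounds=1000, imitation_rate=0.1):
    """Toy imitation dynamics over a population of strategy functions.

    Each strategy maps its current payoff to a donation in [0, 1]. The
    donation is multiplied by `cooperation_benefit` for the recipient,
    so full cooperation maximises collective welfare while donating
    nothing maximises short-term individual gain.
    """
    payoffs = [0.0] * len(strategies)
    for _ in range(rounds):
        # Random pairwise donation game.
        donor, recipient = random.sample(range(len(strategies)), 2)
        gift = min(1.0, max(0.0, strategies[donor](payoffs[donor])))
        payoffs[donor] -= gift
        payoffs[recipient] += cooperation_benefit * gift
        # Cultural evolution: occasionally copy the best-off agent's strategy.
        if random.random() < imitation_rate:
            copier = random.randrange(len(strategies))
            model = max(range(len(strategies)), key=lambda i: payoffs[i])
            strategies[copier] = strategies[model]
    return payoffs, strategies

# Example: a population split between unconditional cooperators and defectors.
population = [lambda _: 1.0] * 50 + [lambda _: 0.0] * 50
final_payoffs, final_strategies = run_cultural_evolution(population)
```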
Google DeepMind research reveals a fundamental architectural limitation of Retrieval-Augmented Generation (RAG) systems that stems from fixed-size embeddings. The work demonstrates that retrieval performance degrades as the database grows, with theoretical limits determined by embedding dimensionality. The researchers introduce the LIMIT benchmark to test these limitations empirically and suggest alternatives such as cross-encoders, multi-vector models, and sparse models.
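To make the architectural contrast concrete, here is a hedged sketch of the three scoring schemes the summary mentions. The functions and the placeholder `model` object are illustrative assumptions, not code from the paper or benchmark.

```python
import numpy as np

def single_vector_score(query_vec: np.ndarray, doc_vec: np.ndarray) -> float:
    """Standard dense retrieval: one fixed-size vector per query/document.
    The fixed dimensionality is what bounds how many distinct top-k
    relevance patterns the index can represent as the corpus grows."""
    return float(query_vec @ doc_vec)

def multi_vector_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (ColBERT-style) retrieval: one vector per token.
    Score = sum over query tokens of the best-matching document token."""
    return float((query_vecs @ doc_vecs.T).max(axis=1).sum())

def cross_encoder_score(query: str, doc: str, model) -> float:
    """Cross-encoder: the model reads query and document jointly, so the
    score is not constrained by a precomputed fixed-size embedding.
    `model` is a placeholder for any pairwise relevance scorer."""
    return model.predict([(query, doc)])[0]
```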
Researchers from Google DeepMind have developed Differentiable Cache Augmentation, a method that uses a coprocessor to augment an LLM's key-value (kv) cache with latent embeddings, enhancing reasoning capabilities without increasing the computational burden on the base model.
"The methodology revolves around a three-stage process. First, the frozen LLM generates a kv-cache from an input sequence, encapsulating its internal representation. This kv-cache is passed to the coprocessor, which processes it with additional trainable soft tokens. Not tied to specific words, these tokens act as abstract prompts for generating latent embeddings. Once processed, the augmented kv-cache is fed back into the LLM, enabling it to generate contextually enriched outputs. This asynchronous operation ensures the coprocessor’s enhancements are applied efficiently without delaying the LLM’s primary functions. Training the coprocessor is conducted using a language modeling loss, focusing solely on its parameters while preserving the integrity of the frozen LLM. This targeted approach allows for scalable and effective optimization."