An unusually detailed post explains how OpenAI's Codex CLI coding agent works. OpenAI engineer Michael Bolin walks through the "agent loop" – the process by which the agent receives user input, generates code, runs tests, and iterates under human supervision – along with prompt construction, caching, and context window management.
* **Agent Loop Mechanics:** The agent builds prompts with prioritized components (system, developer, user, assistant) and sends them to OpenAI’s Responses API.
* **Prompt Management:** Because each turn resends the full conversation, total tokens processed grow quadratically with its length; the system mitigates this through caching and compaction, and its stateless API design enables "Zero Data Retention." Cache misses can significantly degrade performance.
* **Context Window:** Codex automatically compacts conversations to stay within the AI model's context window.
* **Open Source Focus:** OpenAI open-sources the CLI client for Codex, unlike ChatGPT, suggesting a different approach to development and transparency for coding tools.
* **Challenges Acknowledged:** The article doesn't shy away from the engineering challenges, like performance issues and bugs encountered during development.
* **Future Coverage:** Bolin plans to release further posts detailing the CLI’s architecture, tool implementation, and sandboxing model.
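The loop described in the bullets above can be sketched in a few lines. This is an illustrative toy, not Codex's actual code: `build_prompt`, `agent_loop`, and `model_call` are invented names, and `model_call` stands in for a real Responses API request. The sketch also makes the cost structure visible: every turn resends the whole history, which is where the quadratic token growth comes from.

```python
# Illustrative agent-loop sketch: prompt items accumulate across turns,
# so each model call resends the full history (hence quadratic total
# token usage without caching or compaction).

def build_prompt(system, developer, user_turns, assistant_turns):
    """Assemble prompt items in priority order: system, developer, then
    the interleaved conversation so far."""
    items = [{"role": "system", "content": system},
             {"role": "developer", "content": developer}]
    for user, assistant in zip(user_turns, assistant_turns):
        items.append({"role": "user", "content": user})
        items.append({"role": "assistant", "content": assistant})
    if len(user_turns) > len(assistant_turns):
        # The latest user turn has no reply yet.
        items.append({"role": "user", "content": user_turns[-1]})
    return items

def agent_loop(model_call, system, developer, user_input, max_steps=10):
    """Run one task: call the model, execute tool calls, iterate."""
    users, assistants = [user_input], []
    for _ in range(max_steps):
        prompt = build_prompt(system, developer, users, assistants)
        reply = model_call(prompt)          # stand-in for a Responses API call
        assistants.append(reply["content"])
        if reply.get("tool_call") is None:  # model produced a final answer
            return reply["content"]
        # Feed the tool's output back in as the next turn.
        users.append(f"tool result: {reply['tool_call']}")
    return assistants[-1]
```

Because `build_prompt` is a pure function of the accumulated turns, compaction can be modeled as simply replacing older entries of `users`/`assistants` with a summary before the next call.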
Researchers are studying large language models as if they were living organisms, applying analysis techniques borrowed from biology and neuroscience. This approach is revealing unexpected behaviors and limitations of LLMs.
Simon Willison’s annual review of the major trends, breakthroughs, and cultural moments in the large language model ecosystem in 2025, covering reasoning models, coding agents, CLI tools, Chinese open‑weight models, image editing, academic competition wins, and the rise of AI‑enabled browsers.
The article discusses the increasing usefulness of running AI models locally, highlighting benefits like latency, privacy, cost, and control. It explores practical applications such as data processing, note-taking, voice assistance, and self-sufficiency, while acknowledging the limitations compared to cloud-based models.
A detailed review of GPT-5.2, covering its Thinking and Pro modes, code generation, vision, long-context capabilities, speed, and comparisons to Claude Opus 4.5 and Gemini 3 Pro.
This article charts the unlikely rise of the Model Context Protocol (MCP), which over the last year became a generally accepted standard for connecting AI models to external tools and data faster than almost any comparable standard or technology.
LLM Council is a council of LLMs that works together to answer your hardest questions: a local web app that uses OpenRouter to send a query to multiple LLMs, have them review and rank each other's answers, and then have a Chairman LLM produce the final response.
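The council pattern is simple enough to sketch. This is a toy version under stated assumptions: the real app talks to OpenRouter, while `ask` here is a stand-in for any chat-completion call, and the function names are invented for illustration.

```python
# Toy sketch of the "council" pattern: several models answer a question,
# each reviews the others' answers, and a chairman synthesizes the result.
# `ask(model, prompt) -> str` stands in for a real OpenRouter call.

def llm_council(ask, models, chairman, question):
    # Stage 1: every council member answers independently.
    answers = {m: ask(m, question) for m in models}

    # Stage 2: each member ranks the other members' answers.
    reviews = {}
    for m in models:
        others = "\n".join(a for who, a in answers.items() if who != m)
        reviews[m] = ask(m, f"Rank these answers to {question!r}:\n{others}")

    # Stage 3: the chairman sees everything and writes the final reply.
    dossier = f"Question: {question}\nAnswers: {answers}\nReviews: {reviews}"
    return ask(chairman, dossier)
```

A design note: stages 1 and 2 are embarrassingly parallel (independent requests), so a real implementation would fire them concurrently; only the chairman's call must wait for everything else.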
OpenAI releases GPT-5.1 Instant and GPT-5.1 Thinking, upgrades to the GPT-5 series focusing on improved intelligence, conversational style, and customization options for ChatGPT, including new tone presets and the ability to fine-tune the assistant's conversational characteristics.
OpenAI releases gpt-oss-safeguard, an open-weight AI model for content moderation that lets developers supply their own safety policies instead of relying on a fixed, baked-in policy. It reasons about content against the supplied policy at inference time, offering a more flexible and nuanced approach to moderation.
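The policy-as-prompt idea can be illustrated with a short sketch. The prompt wording and function names below are invented for illustration and are not the model's actual interface; `classify` stands in for a call to a reasoning model such as gpt-oss-safeguard.

```python
# Sketch of policy-driven moderation: the safety policy travels with
# every request, so changing the policy requires no retraining.

def build_moderation_prompt(policy: str, content: str) -> str:
    """Bundle a developer-defined policy with the content to judge."""
    return (
        "You are a content moderator. Apply ONLY the policy below.\n"
        f"POLICY:\n{policy}\n"
        f"CONTENT:\n{content}\n"
        "Reason step by step, then answer ALLOW or BLOCK."
    )

def moderate(classify, policy: str, content: str) -> bool:
    """Return True if the model, applying the policy, blocks the content."""
    verdict = classify(build_moderation_prompt(policy, content))
    return verdict.strip().upper() == "BLOCK"
```

The key contrast with a conventional classifier is visible in the signature: `policy` is a runtime argument, so two products with different rules can share the same model.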
An in-depth look at the architecture of OpenAI's GPT-OSS models, detailing tokenization, embeddings, transformer blocks, Mixture of Experts, attention mechanisms (GQA and RoPE), and quantization techniques.
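As one concrete example from that list, rotary position embeddings (RoPE) encode position by rotating each (even, odd) dimension pair of a query or key vector through a position-dependent angle. A minimal pure-Python sketch, simplified to a single flat vector (real models apply this per attention head, and the function name is illustrative):

```python
import math

def rope(vec, position, base=10000.0):
    """Apply a rotary position embedding to a vector of even length.
    Lower dimension pairs rotate faster than higher ones."""
    dim = len(vec)
    out = vec[:]
    for i in range(0, dim, 2):
        theta = position / (base ** (i / dim))
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # 2-D rotation of the (x, y) pair by angle theta.
        out[i] = x * cos_t - y * sin_t
        out[i + 1] = x * sin_t + y * cos_t
    return out
```

Because each pair undergoes a pure rotation, the vector's norm is preserved, and the dot product between a rotated query and key depends only on their relative position offset, which is the property RoPE is designed to provide.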