A summary of a workshop presented at PyCon US on building software with LLMs, covering setup, prompting, building tools (text-to-SQL, structured data extraction, semantic search/RAG), tool usage, and security considerations such as prompt injection. It also discusses the current LLM landscape, including models from OpenAI, Google (Gemini), Anthropic, and open-weight alternatives.
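As a rough sketch of the structured data extraction pattern the workshop covers, the LLM library's Python API accepts a schema alongside a prompt; the model ID, schema fields, and example text below are placeholders rather than workshop material.

```python
import llm
from pydantic import BaseModel


class Talk(BaseModel):
    """Fields to pull out of a free-form announcement (illustrative schema)."""
    title: str
    speaker: str
    start_time: str


# Assumes an API key is already configured for the llm library;
# the model ID here is just an example.
model = llm.get_model("gpt-4.1-mini")
response = model.prompt(
    "Extract the talk details: 'Join Jane Doe for Building software on top "
    "of LLMs, starting at 9am in Room 21.'",
    schema=Talk,
)
print(response.text())  # JSON conforming to the Talk schema
```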
This article introduces a new plugin, llm-video-frames, which lets users feed video files into long-context vision LLMs (such as GPT-4.1) by converting them into a sequence of JPEG frames. It shows how to install and use the plugin, works through examples using a video of Cleo, and discusses the cost and technical details of the process. It also covers how the plugin was developed with an LLM and highlights other features in LLM 0.25.
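The underlying idea, independent of the plugin's internals, is to split a video into JPEG frames and hand them to a vision model as image attachments. A minimal sketch of that approach using ffmpeg and the LLM Python API follows; the frame rate, file paths, and model ID are illustrative assumptions, not the plugin's actual implementation.

```python
import subprocess
import tempfile
from pathlib import Path

import llm

video = "cleo.mp4"  # placeholder path
frames_dir = Path(tempfile.mkdtemp())

# Extract one JPEG per second of video with ffmpeg (assumed to be on PATH).
subprocess.run(
    ["ffmpeg", "-i", video, "-vf", "fps=1", str(frames_dir / "frame_%04d.jpg")],
    check=True,
)

# Send the frames to a long-context vision model as image attachments.
model = llm.get_model("gpt-4.1-mini")  # example model ID
response = model.prompt(
    "Describe what happens in this video, frame by frame.",
    attachments=[
        llm.Attachment(path=str(p)) for p in sorted(frames_dir.glob("*.jpg"))
    ],
)
print(response.text())
```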
An analysis of the recent paper 'The Leaderboard Illusion', which critiques Chatbot Arena's LLM evaluation methodology, focusing on issues with private testing, unfair sampling rates, and potential gaming of the leaderboard. It also explores OpenRouter as a potential alternative ranking system.
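For readers unfamiliar with how Arena-style leaderboards are produced, the sketch below runs a simplified Elo-style update over pairwise battle results. It is purely illustrative of the general ranking mechanism (and of why sampling rates matter); it is not the Arena's or the paper's actual methodology, which fits a Bradley-Terry model.

```python
from collections import defaultdict

# Each battle: (model_a, model_b, winner) where winner is "a" or "b".
battles = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "a"),
    ("model-x", "model-z", "a"),
]

K = 32  # update step size
ratings = defaultdict(lambda: 1000.0)

for a, b, winner in battles:
    # Expected score for model a given the current rating gap.
    expected_a = 1 / (1 + 10 ** ((ratings[b] - ratings[a]) / 400))
    score_a = 1.0 if winner == "a" else 0.0
    ratings[a] += K * (score_a - expected_a)
    ratings[b] += K * ((1 - score_a) - (1 - expected_a))

for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {rating:.1f}")
```

Models that appear in more battles get more rating updates, which is one reason the paper's sampling critique matters.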
Alibaba's Qwen team released the Qwen 3 model family, offering a range of sizes and capabilities. The article discusses the models' features and performance, the well-coordinated release across the LLM ecosystem, and the broader trend of increasingly capable models running on the same hardware.
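A quick sketch of trying one of the smaller Qwen 3 models locally, assuming Ollama and the llm-ollama plugin are installed and the model has been pulled; the `qwen3:8b` tag is an assumption about Ollama's naming, so substitute whatever size fits your hardware.

```python
import llm

# Assumes `llm install llm-ollama` and `ollama pull qwen3:8b` have been run;
# the tag name is an assumption.
model = llm.get_model("qwen3:8b")
response = model.prompt("Explain the difference between a thread and a process.")
print(response.text())
```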
Google's Gemini 2.5 Flash is a new, faster, and more cost-effective model with an adjustable 'thinking' budget. The article details how to use it with llm-gemini, explores how its pricing differs from Gemini 2.0 Flash, and shares example SVG outputs.
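A minimal sketch of calling the model through llm-gemini's Python integration; the preview model ID and the `thinking_budget` option name match what was current when the article appeared, but treat both as assumptions that may have changed since.

```python
import llm

# Assumes `llm install llm-gemini` and a key configured via `llm keys set gemini`.
model = llm.get_model("gemini-2.5-flash-preview-04-17")  # preview ID at the time

# thinking_budget=0 disables the "thinking" step; larger values allow more
# reasoning tokens (option name is an assumption about llm-gemini).
response = model.prompt(
    "Generate an SVG of a pelican riding a bicycle",
    thinking_budget=0,
)
print(response.text())
```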
LLM 0.24 introduces fragments and template plugins to make better use of long-context models, improving storage efficiency and enabling new workflows such as querying logs by fragment and feeding entire documentation sets into a prompt. It also details improvements to template handling and model support.
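To illustrate the fragments idea, here is a sketch that passes a long document as reusable context; the `fragments=` parameter assumes the Python API mirrors the CLI's `-f` flag, so check the 0.24 release notes for the exact interface. The file path and model ID are placeholders.

```python
from pathlib import Path

import llm

# A long document to reuse as context across several prompts.
docs = Path("docs/usage.md").read_text()  # placeholder path

model = llm.get_model("gpt-4.1-mini")  # example model ID
# fragments= is an assumption about the Python API; LLM stores each fragment
# once, de-duplicated, and logs prompts by reference to it.
response = model.prompt(
    "What command-line options does this tool support?",
    fragments=[docs],
)
print(response.text())
```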
A review of Qwen2.5-VL-32B, a vision-language model, noting its performance and capabilities and how it runs on a 64GB Mac. Includes a demonstration with a map image and performance statistics.
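To give a flavour of the kind of prompt in that demonstration, here is a sketch of sending an image to a vision model through the LLM Python API; the model identifier and image path are placeholders, and the review itself ran the model locally via MLX rather than through this exact route.

```python
import llm

# Placeholder model identifier: substitute whatever vision-capable model
# (local or hosted) is available in your llm installation.
model = llm.get_model("qwen2.5vl:32b")

response = model.prompt(
    "Describe this map in detail. What area does it cover and what are the "
    "most prominent labelled features?",
    attachments=[llm.Attachment(path="map.png")],  # placeholder image path
)
print(response.text())
```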
Simon Willison discusses his experience using Large Language Models (LLMs) for coding, providing detailed advice on using LLMs effectively to augment coding ability, setting reasonable expectations, managing context, and more.
A guide on using large language models (LLMs) for programming tasks, including examples, strategies, and useful tips for effectively using AI assistants like ChatGPT and Claude.
Simon Willison discusses the release of llm-anthropic 0.14, which adds support for Claude 3.7 Sonnet's new capabilities: extended thinking mode, a much larger output token limit, and improved support for long-running tasks. The article also covers the plugin's implementation details and limitations.
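As a sketch of turning on extended thinking through the plugin's Python route: the `claude-3.7-sonnet` alias and the `thinking` option name are assumptions based on the plugin's release notes, so verify them against the current documentation.

```python
import llm

# Assumes `llm install llm-anthropic` and an Anthropic API key configured.
model = llm.get_model("claude-3.7-sonnet")  # alias assumed to be registered by the plugin

# thinking=True is assumed to enable extended thinking mode, mirroring the
# CLI's -o thinking 1 option.
response = model.prompt(
    "Write a detailed plan for migrating a large Django app to async views.",
    thinking=True,
)
print(response.text())
```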