Simon Willison reviews the new Qwen2.5-Coder-32B, an open-source LLM by Alibaba, which performs well on various coding benchmarks and can run on personal devices like his MacBook Pro M2.
LLM 0.17 release enables multi-modal input, allowing users to send images, audio, and video files to Large Language Models like GPT-4o, Llama, and Gemini, with a Python API and cost-effective pricing.
A new plugin for LLM, llm-jq, generates and executes jq programs based on human-language descriptions, allowing users to manipulate JSON data without needing to write jq syntax.
Simon Willison explains how to use the mistral.rs library in Rust to run the Llama Vision model on a Mac M2 laptop. He provides a detailed example and discusses the memory usage and GPU utilization.
The author records a screen capture of their Gmail account and uses Google Gemini to extract numeric values from the video.
Datasette is introduced as a functional interactive frontend to tabulated data, either in CSV format or a database schema, catering to data journalists, museum curators, archivists, local governments, and researchers.
The author explores creating tables and inserting data into a SQLite database, then targets the database with Datasette to showcase how errors in data can be identified and corrected.
Simon Willison recently delivered a talk during the Mastering LLMs: A Conference For Developers & Data Scientists, which was a six-week long online event. The talk centered around Simon's LLM Python command-line utility and its plugins, emphasizing how they can be utilized to explore Large Language Models (LLMs) and perform various tasks. Last week, he discussed accessing LLMs from the command-line, sharing valuable insights and techniques with the audience.
Simon Willison shares a scraping technique called Git scraping, where data is scraped and tracked over time by committing the changes to a Git repository. He demonstrates the technique using an example of California fires data from CAL FIRE website.
Simon Willison explains an accidental prompt injection attack on RAG applications, caused by concatenating user questions with documentation fragments in a Retrieval Augmented Generation (RAG) system.
* **http.server**: Run a localhost web server on port 8000: `python -m http.server`
* **base64**: Encode/decode base64: `python -m base64 -h`
* **asyncio**: Python console with top-level `await`: `python -m asyncio`
* **tokenize**: Debug mode for Python tokenizer: `python -m tokenize cgi.py`
* **ast**: Debug mode for Python AST module: `python -m ast cgi.py`
* **json.tool**: Pretty-print JSON: `echo '{"foo": "bar"}' | python -m json.tool`
* **random**: Benchmarking suite for random number generators (fixed in Python 3.13)
* **nntplib**: Display latest articles in a newsgroup: `python -m nntplib`
* **calendar**: Show a calendar for the current year: `python -m calendar` (with options like `-t html`)