Simon Willison reviews the new Qwen2.5-Coder-32B, an open-source LLM by Alibaba, which performs well on various coding benchmarks and can run on personal devices like his MacBook Pro M2.
LLM 0.17 release enables multi-modal input, allowing users to send images, audio, and video files to Large Language Models like GPT-4o, Llama, and Gemini, with a Python API and cost-effective pricing.
A new plugin for LLM, llm-jq, generates and executes jq programs based on human-language descriptions, allowing users to manipulate JSON data without needing to write jq syntax.
Simon Willison explains how to use the mistral.rs library in Rust to run the Llama Vision model on a Mac M2 laptop. He provides a detailed example and discusses the memory usage and GPU utilization.
The author records a screen capture of their Gmail account and uses Google Gemini to extract numeric values from the video.
Simon Willison recently delivered a talk during the Mastering LLMs: A Conference For Developers & Data Scientists, which was a six-week long online event. The talk centered around Simon's LLM Python command-line utility and its plugins, emphasizing how they can be utilized to explore Large Language Models (LLMs) and perform various tasks. Last week, he discussed accessing LLMs from the command-line, sharing valuable insights and techniques with the audience.
Simon Willison explains an accidental prompt injection attack on RAG applications, caused by concatenating user questions with documentation fragments in a Retrieval Augmented Generation (RAG) system.
Mixtral 8x7B:
Use llm-llama-cpp plugin.
Download a GGUF file for Mixtral 8X7B Instruct v0.1.
Run the model using llm -m gguf with the downloaded file.
Llamafile is the new best way to run a Large Language Model on your own computer, offering a single, self-contained file that can be used forever on almost any computer, without needing a network connection
The author created several GPTs, including a Dejargonizer that decodes technical terms, a JavaScript Code Interpreter that runs JavaScript code, a Dependency Chat that identifies project dependencies, and a fun Animal Chefs GPT that generates recipes with animal-themed stories.