LLM 0.17 release enables multi-modal input, allowing users to send images, audio, and video files to Large Language Models like GPT-4o, Llama, and Gemini, with a Python API and cost-effective pricing.
LM Studio has released lms, a command-line interface (CLI) tool to load/unload models, start/stop the API server, and inspect raw LLM input. It is developed on GitHub and is MIT Licensed.
Simon Willison explains how to use the mistral.rs library in Rust to run the Llama Vision model on a Mac M2 laptop. He provides a detailed example and discusses the memory usage and GPU utilization.
code2prompt is a command-line tool (CLI) that converts your codebase into a single LLM prompt with a source tree, prompt templating, and token counting.
A ruby script calculates VRAM requirements for large language models (LLMs) based on model, bits per weight, and context length. It can determine required VRAM, maximum context length, or best bpw given available VRAM.
Pinboard is a command-line utility for managing file references during raw language model development. It aids in streamlining codebase workflows, offering efficient context-aware file updates.
Noema Research introduces Pinboard, a developer tool for improved productivity. Pinboard, a command-line tool, efficiently manages files and terminal references, enhancing development workflows. Key features include flexible pinning, contextual updates, clipboard integration, an interactive shell, and undo functionality.
Simon Willison recently delivered a talk during the Mastering LLMs: A Conference For Developers & Data Scientists, which was a six-week long online event. The talk centered around Simon's LLM Python command-line utility and its plugins, emphasizing how they can be utilized to explore Large Language Models (LLMs) and perform various tasks. Last week, he discussed accessing LLMs from the command-line, sharing valuable insights and techniques with the audience.
A CLI tool for interacting with local or remote LLMs to retrieve information about files, execute queries, and perform other tasks in a Retrieval-Augmented Generation (RAG) fashion.
Retrochat is chat application that supports Llama.cpp, Kobold.cpp, and Ollama. It highlights new features, commands for configuration, chat management, and models, and provides a download link for the release.