llama-swap is a lightweight, transparent proxy server that provides automatic model swapping to llama.cpp's server. It allows you to easily switch between different language models on a local server, supporting OpenAI API compatible endpoints and offering features like model grouping, automatic unloading, and a web UI for monitoring.