LlamaBarn is a macOS menu bar app for running local LLMs. It provides a simple way to install and run models locally, connecting to apps via an OpenAI-compatible API.
This discussion details performance benchmarks of llama.cpp on an NVIDIA DGX Spark, including tests for various models (gpt-oss-20b, gpt-oss-120b, Qwen3, Qwen2.5, Gemma, GLM) with different context depths and batch sizes.