A detailed comparison of the architectures of recent large language models (LLMs), including DeepSeek-V3, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen3, SmolLM3, and Kimi K2, focusing on key design choices and their impact on performance and efficiency.
Learn how to run and fine-tune Mistral's Devstral models, including Devstral Small 2507 (1.1) and 2505. This guide covers the official recommended settings, tutorials for running Devstral in Ollama and llama.cpp, experimental vision support, and fine-tuning with Unsloth; a fine-tuning sketch follows below.
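As a rough illustration of the Unsloth workflow the guide describes, the sketch below loads a Devstral checkpoint and attaches LoRA adapters. The model id, sequence length, and LoRA hyperparameters are assumptions for illustration only; follow the guide's official recommended settings.

```python
# Minimal sketch of LoRA fine-tuning setup with Unsloth (model id and
# hyperparameters are illustrative assumptions, not the guide's settings).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Devstral-Small-2507",  # assumed checkpoint id
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach trainable low-rank adapters; the base weights stay frozen.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```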
A set of tools to help you work with Mistral models, including tokenization, validation, and normalization code.
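A minimal usage sketch, assuming these tools are the mistral-common Python package and that its tokenizer API looks roughly like the following; exact import paths and attribute names may differ between versions.

```python
# Sketch: tokenize a chat request with mistral-common (API details assumed).
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

tokenizer = MistralTokenizer.v3()  # pick the tokenizer version matching your model

request = ChatCompletionRequest(messages=[UserMessage(content="Hello, Mistral!")])
tokenized = tokenizer.encode_chat_completion(request)

print(tokenized.tokens)  # token ids, including special tokens
print(tokenized.text)    # the rendered prompt string
```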
A collection of Python examples demonstrating the use of mistral.rs, a Rust library for fast LLM inference, through its Python bindings.
Mistral AI has introduced two methods for creating custom AI agents: La Plateforme Agent Builder, a user-friendly interface, and the Agent API, a programmatic solution. Both let users create and configure agents built on Mistral's models or on fine-tuned models.
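For the programmatic path, a sketch of what querying a pre-configured agent might look like with the mistralai Python client; the agent id is a placeholder and the exact client methods are assumptions, so check the official Agent API documentation.

```python
# Sketch: calling an agent created on La Plateforme (method names assumed;
# "ag_your_agent_id" is a placeholder, not a real agent).
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.agents.complete(
    agent_id="ag_your_agent_id",
    messages=[{"role": "user", "content": "Summarize yesterday's support tickets."}],
)
print(response.choices[0].message.content)
```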
This article compares the performance of the smaller language models Gemma, Llama 3, and Mistral on reading comprehension tasks. The author highlights the trend toward smaller, more accessible models and discusses Apple's recent foray into the field with its own proprietary model.
An article on how to properly prompt the Mistral AI Instruct models, explaining the role of the BOS token, the [INST] and [/INST] markers, and other special tokens.
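As a rough sketch of the format the article discusses: in the Mistral Instruct convention, user turns are wrapped in [INST] ... [/INST] and the BOS token opens the sequence. Exact whitespace and token handling differ between tokenizer versions, so treat the layout below as illustrative rather than authoritative.

```python
# Illustrative Mistral Instruct prompt layout (whitespace and token details
# vary by tokenizer version). Most serving stacks add <s> (BOS) and </s> (EOS)
# automatically; they are shown here only to make the structure visible.
system = "You are a concise assistant."
turn_1_user = "What is LoRA?"
turn_1_assistant = "A low-rank fine-tuning method."
turn_2_user = "Give one advantage."

prompt = (
    "<s>"
    f"[INST] {system}\n\n{turn_1_user} [/INST]"   # system text rides with the first user turn
    f" {turn_1_assistant}</s>"                     # assistant reply closed with EOS
    f"[INST] {turn_2_user} [/INST]"                # next user turn awaiting a completion
)
print(prompt)
```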
This model was converted to GGUF format from Joseph717171/Mistral-12.25B-v0.2 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
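A minimal sketch of loading a GGUF file like this one locally with the llama-cpp-python bindings; the file name below is a placeholder for whichever quantization you download.

```python
# Sketch: run a downloaded GGUF quantization with llama-cpp-python
# (the model_path value is a placeholder file name).
from llama_cpp import Llama

llm = Llama(model_path="mistral-12.25b-v0.2-q4_k_m.gguf", n_ctx=4096)

out = llm("[INST] Explain GGUF in one sentence. [/INST]", max_tokens=128)
print(out["choices"][0]["text"])
```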
This article explains how to install Ollama, an open-source project for running large language models (LLMs) on a local machine, on Ubuntu Linux. It covers system requirements, the installation process, and how to use the various available LLMs.
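Once Ollama is installed and a model has been pulled, it can be driven from Python as well as from the CLI the article walks through; a minimal sketch with the ollama Python package (field access varies slightly between package versions):

```python
# Sketch: query a locally served model through the ollama Python package.
# Assumes a Mistral model has already been pulled (e.g. via `ollama pull mistral`).
import ollama

response = ollama.chat(
    model="mistral",
    messages=[{"role": "user", "content": "Why run LLMs locally?"}],
)
print(response["message"]["content"])  # newer versions also expose response.message.content
```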
A lightweight codebase that enables memory-efficient and performant fine-tuning of Mistral's models. It is based on LoRA, a training paradigm in which most weights are frozen and only 1-2% of additional weights, in the form of low-rank matrix perturbations, are trained.
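To make the idea concrete, here is a minimal PyTorch sketch of the LoRA pattern the codebase is built on: the pretrained weight is frozen and only a low-rank update is trained. This is a conceptual illustration, not the repository's actual implementation.

```python
# Conceptual LoRA sketch (not the repository's code): freeze W and train a
# low-rank update B @ A so the effective weight becomes W + (alpha / r) * B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the scaled low-rank perturbation.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```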