A collection of Python examples demonstrating the use of Mistral.rs, a Rust library for working with Mistral models.
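To give a feel for what such an example looks like, here is a minimal sketch using the project's Python bindings. The `Runner`/`Which`/`ChatCompletionRequest` names follow the bindings' published examples but are assumptions that may differ between versions, and the model and file identifiers are illustrative.

```python
# Minimal sketch: load a GGUF-quantized Mistral model through the mistralrs
# Python bindings and run one chat completion. Class and parameter names are
# based on the project's example code and may vary by version; model IDs and
# the GGUF filename are placeholders.
from mistralrs import Runner, Which, ChatCompletionRequest

runner = Runner(
    which=Which.GGUF(
        tok_model_id="mistralai/Mistral-7B-Instruct-v0.1",            # tokenizer source (illustrative)
        quantized_model_id="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",  # quantized weights repo (illustrative)
        quantized_filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf",    # GGUF file (illustrative)
    )
)

response = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="mistral",
        messages=[{"role": "user", "content": "Summarize what Mistral.rs does."}],
        max_tokens=128,
        temperature=0.1,
    )
)
print(response.choices[0].message.content)
```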
Mistral AI has introduced two ways to create custom AI agents: La Plateforme Agent Builder, a user-friendly interface, and the Agent API, a programmatic solution. Both let users create and configure agents backed by Mistral's AI models or fine-tuned models.
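As a rough illustration of the programmatic route, the sketch below assumes the mistralai Python SDK's agents interface (`client.agents.complete`); the agent ID is a placeholder for an agent created via La Plateforme or the API.

```python
# Sketch: call a pre-configured agent through the Mistral Agent API.
# Assumes the mistralai Python SDK exposes client.agents.complete; the
# agent ID below is a placeholder.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.agents.complete(
    agent_id="ag_your_agent_id",  # placeholder: replace with a real agent ID
    messages=[{"role": "user", "content": "Draft a release note for version 1.2."}],
)
print(response.choices[0].message.content)
```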
This article compares the performance of the smaller language models Gemma, Llama 3, and Mistral on reading comprehension tasks. The author highlights the trend toward smaller, more accessible models and discusses Apple's recent foray into the field with its own proprietary model.
An article on how to properly prompt the Mistral AI Instruct models, explaining the role of the BOS, [INST], and other special tokens.
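For reference, a minimal sketch of building a correctly formatted Instruct prompt by letting the Hugging Face tokenizer's chat template place the special tokens; the model ID is illustrative.

```python
# Sketch: let the tokenizer's chat template insert the BOS and [INST]/[/INST]
# special tokens rather than hand-writing them.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "And of Italy?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
# Expected shape (roughly): <s>[INST] ... [/INST] ... </s>[INST] ... [/INST]
```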
This model was converted to GGUF format from Joseph717171/Mistral-12.25B-v0.2 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
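To try a converted file locally, a minimal sketch with llama-cpp-python looks roughly like this; the GGUF filename is a placeholder for whatever file the repo actually ships.

```python
# Sketch: load a GGUF quant with llama-cpp-python and run a chat completion.
# model_path is a placeholder; point it at the GGUF file downloaded from the
# converted repo.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-12.25b-v0.2-q4_k_m.gguf",  # placeholder filename
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me three facts about the Rust borrow checker."}],
    max_tokens=200,
)
print(out["choices"][0]["message"]["content"])
```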
This article explains how to install Ollama, an open-source project for running large language models (LLMs) on a local machine, on Ubuntu Linux. It also covers the system requirements, the installation process, and how to use the various available LLMs.
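Once Ollama is installed, one way to exercise a local model from Python is the ollama client package; a minimal sketch (model name illustrative) follows.

```python
# Sketch: chat with a locally served model through the ollama Python package.
# Assumes `ollama pull mistral` (or another model) has already been run.
import ollama

response = ollama.chat(
    model="mistral",
    messages=[{"role": "user", "content": "Explain what a context window is in one sentence."}],
)
print(response["message"]["content"])
```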
A lightweight codebase that enables memory-efficient and performant fine-tuning of Mistral's models. It is based on LoRA, a training paradigm in which most weights are frozen and only 1-2% of additional weights, in the form of low-rank matrix perturbations, are trained.
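To make the "low-rank matrix perturbation" idea concrete, here is a small framework-level sketch in PyTorch (not the codebase's actual implementation): the base weight is frozen and only the two low-rank factors are trainable.

```python
# Illustrative LoRA linear layer: y = x @ (W + (alpha/r) * B @ A)^T, with W
# frozen and only the low-rank factors A and B trained. This is a sketch of
# the general technique, not mistral-finetune's code.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # frozen pretrained weight

        # Trainable low-rank perturbation factors.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


layer = LoRALinear(4096, 4096, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable fraction: {trainable / total:.2%}")  # roughly 0.4% at r=8
```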
This article describes how to implement function calling in an AI system built on the Mistral AI platform. The example walks through building an assistant that manages a home automation system through natural-language interaction with the user, covering the definition of the available functions, the function logic itself, and the integration of those functions into the AI system.
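A rough sketch of that flow (function schema, function logic, and dispatch) is shown below, assuming the mistralai Python SDK's `chat.complete` interface with `tools`; the `set_light` function and device names are invented for illustration.

```python
# Sketch of function calling for a home-automation assistant. The tool schema
# and set_light function are invented for illustration; the client calls assume
# the mistralai SDK's chat.complete interface.
import json
import os
from mistralai import Mistral


def set_light(room: str, state: str) -> str:
    """Hypothetical function logic: toggle a light and report the result."""
    return json.dumps({"room": room, "state": state, "ok": True})


tools = [{
    "type": "function",
    "function": {
        "name": "set_light",
        "description": "Turn a light in a given room on or off.",
        "parameters": {
            "type": "object",
            "properties": {
                "room": {"type": "string"},
                "state": {"type": "string", "enum": ["on", "off"]},
            },
            "required": ["room", "state"],
        },
    },
}]

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
messages = [{"role": "user", "content": "Please turn on the kitchen light."}]

response = client.chat.complete(
    model="mistral-large-latest", messages=messages, tools=tools, tool_choice="auto"
)
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = set_light(**args)  # run the local function logic

# Feed the result back so the model can phrase the final answer.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "name": tool_call.function.name,
                 "content": result, "tool_call_id": tool_call.id})
final = client.chat.complete(model="mistral-large-latest", messages=messages, tools=tools)
print(final.choices[0].message.content)
```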
Mistral.rs is a fast LLM inference platform supporting inference on a variety of devices, quantization, and easy use via an OpenAI-API-compatible HTTP server and Python bindings. It supports the latest Llama and Phi models, along with X-LoRA and LoRA adapters. The project aims to provide the fastest LLM inference platform possible.
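Because the server speaks the OpenAI API, any OpenAI-compatible client can talk to it; a minimal sketch, assuming a mistral.rs server is already running locally (adjust the port, key, and model name to your setup):

```python
# Sketch: query a locally running mistral.rs HTTP server through the standard
# openai client. The port, API key, and model name are placeholders for
# whatever the server was started with.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

completion = client.chat.completions.create(
    model="mistral",  # placeholder: match the model the server is serving
    messages=[{"role": "user", "content": "In one sentence, what is quantization?"}],
    max_tokens=64,
)
print(completion.choices[0].message.content)
```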