Llama Stack v0.1.0 is the first stable API release, enabling developers to build RAG applications and agents, integrate with external tools, and use telemetry for monitoring and evaluation. The release provides a comprehensive interface, a rich provider ecosystem, and multiple developer interfaces, along with sample applications for Python, iOS, and Android.
Meta has launched Llama Stack 0.1.0, a development platform designed to simplify building AI applications with Llama models. The platform offers standardized building blocks and flexible deployment options, including remote and local hosting. It features a plugin system for various API providers and supports multiple programming environments through its CLI tools and SDKs. Meta aims to address common challenges AI developers face, such as integrating tools and managing data sources.
Sparse autoencoders (SAEs) have been trained on Llama 3.3 70B, and the resulting interpreted model has been released with API access, enabling research and product development through feature-space exploration and steering.
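The steering workflow can be sketched in a few lines: encode a model activation into a sparse feature space, clamp one interpreted feature, and decode back. The dimensions and weights below are toy placeholders (a real SAE on Llama 3.3 70B would use the model's hidden size and a trained dictionary); the function names are illustrative, not the released API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; a real SAE uses the model's hidden width and a far larger dictionary.
d_model, d_features = 16, 64

# Hypothetical "trained" SAE weights (random here, purely for illustration).
W_enc = rng.normal(size=(d_model, d_features))
b_enc = np.zeros(d_features)
W_dec = rng.normal(size=(d_features, d_model))
b_dec = np.zeros(d_model)

def encode(x):
    # ReLU yields the sparse, non-negative feature activations.
    return np.maximum(0.0, x @ W_enc + b_enc)

def decode(f):
    return f @ W_dec + b_dec

def steer(x, feature_idx, strength):
    """Reconstruct x with one interpreted feature clamped to `strength`."""
    f = encode(x)
    f[feature_idx] = strength
    return decode(f)

x = rng.normal(size=d_model)  # stand-in for a residual-stream activation
steered = steer(x, feature_idx=3, strength=5.0)
```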
MCP (Model Context Protocol) is an open-source standard that enhances interaction between AI systems and various data sources, improving usability, response quality, and security.
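Concretely, MCP messages travel as JSON-RPC 2.0 requests and responses. The sketch below builds one such request; the `search_docs` tool name and its argument are hypothetical examples, not part of the protocol.

```python
import json

def jsonrpc_request(req_id, method, params=None):
    """Build a JSON-RPC 2.0 request, the wire format MCP messages use."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# A tools/call request asking an MCP server to invoke a (hypothetical)
# "search_docs" tool with one argument.
call = jsonrpc_request(
    2,
    "tools/call",
    {"name": "search_docs", "arguments": {"query": "rate limits"}},
)

wire = json.dumps(call)
```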
GitHub Models now allows developers to retrieve structured JSON responses from models directly in the UI, improving integration with applications and workflows. Supported models include OpenAI models (except o1-mini and o1-preview) and Mistral models.
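Structured responses are typically driven by a JSON Schema attached to the request in the OpenAI-style `response_format` field, and the model's reply comes back as a JSON string that can be parsed directly. The schema and model id below are made-up examples of the shape, not a GitHub Models-specific payload.

```python
import json

# Hypothetical schema: ask the model to reply with a name and a confidence.
schema = {
    "name": "extraction",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "confidence": {"type": "number"},
        },
        "required": ["name", "confidence"],
        "additionalProperties": False,
    },
}

# OpenAI-style request body carrying the schema.
body = {
    "model": "gpt-4o",  # placeholder model id
    "messages": [{"role": "user", "content": "Who wrote Dune?"}],
    "response_format": {"type": "json_schema", "json_schema": schema},
}

# The structured reply arrives as a JSON string in the message content,
# so it can be parsed instead of scraped from free text.
sample_content = '{"name": "Frank Herbert", "confidence": 0.99}'
parsed = json.loads(sample_content)
```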
A simple, unified interface to multiple generative AI providers, supporting OpenAI, Anthropic, Azure, Google, AWS, Groq, Mistral, Hugging Face, and Ollama. It aims to make it easy to use multiple LLMs through a standardized interface modeled on OpenAI's.
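The core pattern such libraries implement can be sketched in a few lines: a `provider:model` string routes one OpenAI-style chat call to the matching backend. The backends here are stubs that echo their input; a real client would wrap each provider's SDK. All names are illustrative, not the library's actual API.

```python
# Stub backends standing in for real provider SDK calls.
def _openai_backend(model, messages):
    return f"[openai/{model}] echo: {messages[-1]['content']}"

def _anthropic_backend(model, messages):
    return f"[anthropic/{model}] echo: {messages[-1]['content']}"

BACKENDS = {"openai": _openai_backend, "anthropic": _anthropic_backend}

def chat_completion(model, messages):
    """Dispatch a 'provider:model' string to the right backend."""
    provider, _, model_name = model.partition(":")
    try:
        backend = BACKENDS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}")
    return backend(model_name, messages)

reply = chat_completion(
    "anthropic:claude-3-5-sonnet",
    [{"role": "user", "content": "hello"}],
)
```

Because every backend takes and returns the same shapes, switching providers is a one-string change in application code.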
LiteLLM is a library for deploying and managing large language model (LLM) APIs through a standardized format. It supports multiple LLM providers, includes proxy-server features for load balancing and cost tracking, and offers various integrations for logging and observability.
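To make the two proxy-server features above concrete, here is a toy round-robin load balancer with per-deployment cost tracking. The deployment names and per-token prices are invented for illustration; this is a sketch of the pattern, not LiteLLM's implementation.

```python
from itertools import cycle

class Router:
    """Toy round-robin load balancer that also tracks spend per deployment."""

    def __init__(self, deployments, price_per_1k_tokens):
        self._ring = cycle(deployments)          # round-robin over deployments
        self._prices = price_per_1k_tokens       # hypothetical $ per 1K tokens
        self.spend = {d: 0.0 for d in deployments}

    def route(self, tokens):
        """Pick the next deployment and record the cost of this request."""
        deployment = next(self._ring)
        self.spend[deployment] += tokens / 1000 * self._prices[deployment]
        return deployment

router = Router(
    ["gpt-4o-eu", "gpt-4o-us"],
    {"gpt-4o-eu": 0.01, "gpt-4o-us": 0.01},
)
first = router.route(tokens=500)
second = router.route(tokens=500)
```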
This page provides documentation for the rerank API, including endpoints, request parameters, and response formats.
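Rerank endpoints commonly take a query plus a list of documents and return results as `{index, relevance_score}` pairs sorted by score. The toy scorer below uses simple term overlap so it runs standalone; a production rerank API would score with a cross-encoder model, and the exact field names vary by provider.

```python
def rerank(query, documents, top_n=None):
    """Toy reranker returning results in a common {index, relevance_score} shape."""
    q_terms = set(query.lower().split())
    results = []
    for i, doc in enumerate(documents):
        d_terms = set(doc.lower().split())
        # Fraction of query terms that appear in the document.
        score = len(q_terms & d_terms) / max(len(q_terms), 1)
        results.append({"index": i, "relevance_score": score})
    # Highest-scoring documents first, optionally truncated to top_n.
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return {"results": results[:top_n]}

resp = rerank(
    "capital of France",
    ["Paris is the capital of France.", "Berlin is in Germany."],
    top_n=1,
)
```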
This repository contains the Llama Stack API specifications, along with API providers and Llama Stack distributions. The Llama Stack aims to standardize the building blocks needed for generative AI applications across the various stages of development, with APIs for Inference, Safety, Memory, Agentic System, Evaluation, Post Training, Synthetic Data Generation, and Reward Scoring. Providers supply the actual implementations of these APIs, backed either by open-source libraries or by remote REST services.
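The API/provider split described above follows a familiar pattern: the API is an abstract interface, and each provider is an interchangeable implementation behind it. The class and method names below are illustrative, not Llama Stack's actual definitions.

```python
from abc import ABC, abstractmethod

class InferenceAPI(ABC):
    """Sketch of a stack-style API: an abstract contract providers implement."""

    @abstractmethod
    def chat_completion(self, messages: list) -> str: ...

class LocalProvider(InferenceAPI):
    # Stand-in for an open-source library running the model in-process.
    def chat_completion(self, messages):
        return f"local reply to: {messages[-1]['content']}"

class RemoteProvider(InferenceAPI):
    # Stand-in for a remote REST service; here it only echoes.
    def __init__(self, base_url):
        self.base_url = base_url

    def chat_completion(self, messages):
        return f"remote({self.base_url}) reply to: {messages[-1]['content']}"

def run(provider: InferenceAPI):
    # Application code targets the API, so providers are interchangeable.
    return provider.chat_completion([{"role": "user", "content": "hi"}])
```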
TabbyAPI is a FastAPI-based application for generating text with a large language model (LLM) through the ExLlamaV2 backend. It supports various model types and features such as Hugging Face model downloading, embedding model support, and more.