A simple, unified interface to multiple Generative AI providers, including OpenAI, Anthropic, Azure, Google, AWS, Groq, Mistral, HuggingFace, and Ollama. It aims to make working with multiple LLMs easy by exposing a standardized interface similar to OpenAI's.
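As a conceptual sketch (not the library's actual code), the "standardized interface" idea boils down to routing a single call shape to per-provider backends, often keyed by a `provider:model` string. The backend functions below are hypothetical stand-ins; a real implementation would wrap each provider's SDK behind the same signature:

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Hypothetical stand-ins: a real unified client would wrap each
# provider's SDK (OpenAI, Anthropic, Ollama, ...) behind this signature.
def _openai_backend(model: str, messages: List[Message]) -> str:
    return f"[openai:{model}] echo: {messages[-1]['content']}"

def _ollama_backend(model: str, messages: List[Message]) -> str:
    return f"[ollama:{model}] echo: {messages[-1]['content']}"

BACKENDS: Dict[str, Callable[[str, List[Message]], str]] = {
    "openai": _openai_backend,
    "ollama": _ollama_backend,
}

def chat_completion(model: str, messages: List[Message]) -> str:
    """Dispatch a 'provider:model' string to the matching backend,
    keeping one OpenAI-style call shape for every provider."""
    provider, _, model_name = model.partition(":")
    return BACKENDS[provider](model_name, messages)
```

Swapping providers then means changing only the model string, e.g. `chat_completion("ollama:llama3", msgs)` instead of `chat_completion("openai:gpt-4o", msgs)`, while the message format stays the same.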
llama-cpp-python offers a web server that aims to act as a drop-in replacement for the OpenAI API. This allows you to use llama.cpp-compatible models with any OpenAI-compatible client (language libraries, services, etc.).
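Because the server speaks the OpenAI wire protocol, you can talk to it with nothing but the standard library. A minimal sketch, assuming a server has been started locally (e.g. with `python -m llama_cpp.server --model ./model.gguf`, where the model path is hypothetical) and is listening on its default address:

```python
import json
import urllib.request

# Default address of a locally running llama-cpp-python server;
# adjust host/port if the server was started with different settings.
BASE_URL = "http://localhost:8000/v1"

def chat(prompt: str) -> str:
    """Send one chat turn to the local OpenAI-compatible endpoint."""
    payload = {
        # The local server serves whichever model it was launched with,
        # so this name is effectively a label.
        "model": "local",
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The local server does not validate the API key.
            "Authorization": "Bearer no-key",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The same request shape works unchanged with any other OpenAI-compatible client library by pointing its base URL at the local server.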