The Cerebras API offers low-latency AI model inference using Cerebras Wafer-Scale Engines and CS-3 systems, providing access to Meta's Llama models for conversational applications.
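A minimal sketch of a chat completion request against the Cerebras API, assuming an OpenAI-compatible endpoint at https://api.cerebras.ai/v1 and a model identifier such as "llama3.1-8b" (both are assumptions; check the Cerebras documentation for current values):

# Minimal sketch: chat completion against the Cerebras API.
# Endpoint path and model name are assumptions; consult the Cerebras docs.
import os
import requests

resp = requests.post(
    "https://api.cerebras.ai/v1/chat/completions",  # assumed OpenAI-compatible endpoint
    headers={"Authorization": f"Bearer {os.environ['CEREBRAS_API_KEY']}"},
    json={
        "model": "llama3.1-8b",  # assumed model identifier
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])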
TabbyAPI is a FastAPI-based application for generating text with an LLM (large language model) through the ExLlamaV2 backend. It supports various model types and features such as HuggingFace model downloading and embedding model support.
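A minimal sketch of a completion request against a locally running TabbyAPI instance; the port (5000), endpoint path, and authorization header are assumptions, so consult the TabbyAPI documentation for your setup:

# Minimal sketch: text completion against a local TabbyAPI server.
# Port, endpoint path, and auth header are assumptions; see the TabbyAPI docs.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",  # assumed OpenAI-compatible endpoint
    headers={"Authorization": "Bearer YOUR_TABBY_API_KEY"},  # placeholder key
    json={"prompt": "Once upon a time", "max_tokens": 64},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])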