The Cerebras API offers low-latency AI model inference using Cerebras Wafer-Scale Engines and CS-3 systems, providing access to Meta's Llama models for conversational applications.
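A minimal sketch of a chat completion request against the Cerebras API, assuming an OpenAI-compatible endpoint at https://api.cerebras.ai/v1 and a model identifier such as "llama3.1-8b" (both are assumptions; check the Cerebras documentation for current values):

# Minimal sketch: chat completion against the Cerebras API.
# Endpoint path and model name are assumptions; consult the Cerebras docs.
import os
import requests

resp = requests.post(
    "https://api.cerebras.ai/v1/chat/completions",  # assumed OpenAI-compatible endpoint
    headers={"Authorization": f"Bearer {os.environ['CEREBRAS_API_KEY']}"},
    json={
        "model": "llama3.1-8b",  # assumed model identifier
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])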
TabbyAPI is a FastAPI-based application for generating text with an LLM (large language model) through the ExLlamaV2 backend. It supports various model types and features such as HuggingFace model downloading and embedding model support.
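A minimal sketch of a completion request against a locally running TabbyAPI instance; the port (5000), endpoint path, and authorization header are assumptions, so consult the TabbyAPI documentation for your setup:

# Minimal sketch: text completion against a local TabbyAPI server.
# Port, endpoint path, and auth header are assumptions; see the TabbyAPI docs.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",  # assumed OpenAI-compatible endpoint
    headers={"Authorization": "Bearer YOUR_TABBY_API_KEY"},  # placeholder key
    json={"prompt": "Once upon a time", "max_tokens": 64},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])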