SemanticScuttle - klotz.me » klotz: litellm

klotz: litellm*

Python implementation of Recursive Language Models for processing unbounded context lengths. Process 100k+ tokens with any LLM by storing context as variables instead of prompts.

2026-01-06 Tags: llm, recursive, context, python, litellm, long context, mit, alex zhang by klotz
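
The idea in the entry above (keep the long context in a variable and only ever place small pieces of it in a prompt) can be illustrated with a toy sketch. Everything here is an assumption for illustration: `recursive_query`, the `LLM` callable type, and the `max_chars` budget are hypothetical names, not the bookmarked project's API, and plain halving is a deliberate simplification; the real project may decide how to split and recurse quite differently.

```python
from typing import Callable

# Hypothetical interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def recursive_query(llm: LLM, context: str, question: str, max_chars: int = 2000) -> str:
    """Answer `question` about `context` without sending the whole context at once.

    The full context lives only in the `context` variable. Each model call
    receives either a slice no longer than `max_chars` or two partial answers.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")

    # Base case: the slice is small enough to send directly.
    if len(context) <= max_chars:
        return llm(f"Context:\n{context}\n\nQuestion: {question}")

    # Recursive case: split the stored context and query each half separately.
    mid = len(context) // 2
    left = recursive_query(llm, context[:mid], question, max_chars)
    right = recursive_query(llm, context[mid:], question, max_chars)

    # Combine the two partial answers with one more (small) model call.
    return llm(f"Partial answers:\n{left}\n{right}\n\nQuestion: {question}")
```

In practice `llm` could wrap any chat-completion client. Note that naive halving can cut a relevant passage in two, so a more careful version would split on paragraph or document boundaries.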

Server approved! 4xH100 (320gb vram). Looking for advice

A user is seeking advice on deploying a new server with 4x H100 GPUs (320GB VRAM) for on-premise AI workloads. They are considering a Kubernetes-based deployment with RKE2, the NVIDIA GPU Operator, and tools like vLLM, llama.cpp, and LiteLLM. They are also exploring GPU pass-through with a hypervisor. The post details their current infrastructure and asks about potential gotchas and best practices.

2025-04-28 Tags: h100, kubernetes, vllm, llama.cpp, gpu, ai, deployment, rke2, litellm, quantization, sxm, fp8, awq, gguf, production engineering, inference engineering, scale, reddit, localllama by klotz

Callbacks - LiteLLM Docs

Use callbacks to send output data to PostHog, Sentry, and similar services. LiteLLM provides input_callbacks, success_callbacks, and failure_callbacks to easily send data based on response status.

2024-10-23 Tags: litellm, posthog, sentry, langfuse, langsmith, helicone, traceloop, lunary, athina, slack, observability, logging, production engineering by klotz
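
The status-based callback pattern described in the entry above can be sketched in plain Python. This is not LiteLLM's implementation or API: `CallbackRegistry`, its `run` method, and the event dictionaries are hypothetical names chosen for illustration. Only the three callback categories (input, success, failure) come from the description.

```python
from typing import Any, Callable, Dict, List

# Hypothetical callback type: receives a dictionary describing one event.
Callback = Callable[[Dict[str, Any]], None]


class CallbackRegistry:
    """Illustrative stand-in for status-based callback lists (not LiteLLM's API)."""

    def __init__(self) -> None:
        self.input_callbacks: List[Callback] = []
        self.success_callbacks: List[Callback] = []
        self.failure_callbacks: List[Callback] = []

    def run(self, call: Callable[[str], str], prompt: str) -> str:
        # Input callbacks fire before the model call is made.
        for cb in self.input_callbacks:
            cb({"event": "input", "prompt": prompt})
        try:
            output = call(prompt)
        except Exception as exc:
            # Failure callbacks fire when the call raises, then the error propagates.
            for cb in self.failure_callbacks:
                cb({"event": "failure", "prompt": prompt, "error": repr(exc)})
            raise
        # Success callbacks fire only after a normal return.
        for cb in self.success_callbacks:
            cb({"event": "success", "prompt": prompt, "output": output})
        return output


def demo() -> List[str]:
    """Run one successful and one failing call and return the recorded event names."""
    events: List[str] = []
    registry = CallbackRegistry()
    registry.input_callbacks.append(lambda e: events.append(e["event"]))
    registry.success_callbacks.append(lambda e: events.append(e["event"]))
    registry.failure_callbacks.append(lambda e: events.append(e["event"]))

    def ok(prompt: str) -> str:
        return prompt.upper()

    def broken(prompt: str) -> str:
        raise RuntimeError("provider down")

    registry.run(ok, "hi")
    try:
        registry.run(broken, "hi")
    except RuntimeError:
        pass
    return events
```

In a real deployment the lambdas would be replaced by functions that forward each event to an observability service; the linked LiteLLM docs describe the actual configuration.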

BerriAI/litellm README.md

LiteLLM is a library for calling and managing LLM (large language model) APIs in a single standardized format (the OpenAI format). It supports multiple LLM providers, includes proxy server features for load balancing and cost tracking, and offers various integrations for logging and observability.

2024-10-23 Tags: litellm, llm, api, proxy, logging by klotz

discord-llm-chatbot: Talk to LLMs with your friends!

discord-llm-chatbot is a Discord bot, hosted on GitHub, for chatting with large language models (LLMs) directly in your Discord server. It supports the OpenAI, Mistral, and Anthropic APIs, as well as local model servers such as ollama, oobabooga, Jan, and LM Studio. The bot offers a reply-based chat system, a customizable system prompt, and seamless threading of conversations. It also supports image and text file attachments and streamed responses.

2024-10-22 Tags: discord, bot, chatbot, llm, llava, llamacpp, oobabooga, lm studio, ollama, litellm, llmcord, llama3, gpt-4o by klotz
