SemanticScuttle - klotz.me » klotz: reddit+production engineering

klotz: reddit + production engineering


Server approved! 4xH100 (320gb vram). Looking for advice

A user seeks advice on deploying a new server with 4x H100 GPUs (320 GB VRAM total) for on-premise AI workloads. They are considering a Kubernetes-based deployment using RKE2 and the NVIDIA GPU Operator, with tools such as vLLM, llama.cpp, and LiteLLM, and are also weighing GPU pass-through under a hypervisor. The post describes their current infrastructure and asks about potential gotchas and best practices.

2025-04-28 Tags: h100, kubernetes, vllm, llama.cpp, gpu, ai, deployment, rke2, litellm, quantization, sxm, fp8, awq, gguf, production engineering, inference engineering, scale, reddit, localllama by klotz

Claude Code saved the day and my sanity :)

A developer recounts how Claude Code helped resolve a critical memory usage issue in an API endpoint, cutting memory usage by 99% and supplying detailed solutions with supporting evidence.

2025-03-11 Tags: claude, production engineering, performance analysis, devops, reddit by klotz

How to get a Kubernetes environment on Ubuntu 22.04 for dev/tinkering purposes ready?

2022-09-28 Tags: kubernetes, ubuntu, 22.04, hacks, homelab, reddit, production engineering by klotz

Splunk Cloud and CloudFlare logs

2022-02-04 Tags: splunk, logs, cloudflare, reddit, zero trust, production engineering by klotz

How do you keep track of the state of your cluster?

2018-09-15 Tags: kubernetes, monitoring, configuration, helm, reddit, production engineering by klotz

My Kubernetes homelab - homelab

2018-09-12 Tags: kubernetes, homelab, reddit, production engineering by klotz

An alternative to fat JARs : java

2016-06-20 Tags: java, deployment, fatjar, reddit, hubspot, maven, production engineering by klotz
