Tags: production engineering

Production Engineering focuses on the design, implementation, and management of systems and processes that ensure the efficient and reliable delivery of software and services in production. It spans deploying, monitoring, and maintaining applications, managing infrastructure, and operating data pipelines. Key Production Engineering KPIs include availability and cost.


  1. PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs — designed to support data privacy and GDPR compliance. It uses the gemma:3b model running locally via Ollama.
  2. "A fully autonomous, AI-powered DevOps platform for managing cloud infrastructure across multiple providers, with AWS and GitHub integration, powered by OpenAI's Agents SDK."
  3. Why developers are spinning up AI behind your back — and how to detect it. The article discusses the rise of 'Shadow AI': developers integrating LLMs into production without approval, the risks involved, and strategies for organizations to manage it effectively.

    >We’ve seen LLMs used to auto-tag infrastructure, classify alerts, generate compliance doc stubs, and spin up internal search tools on top of knowledge bases. We’ve also seen them quietly embedded into CI/CD workflows...
  4. Running GenAI models is easy. Scaling them to thousands of users, not so much. This guide details avenues for scaling AI workloads from proofs of concept to production-ready deployments, covering API integration, on-prem deployment considerations, hardware requirements, and tools like vLLM and Nvidia NIMs.
  5. K8S-native cluster-wide deployment for vLLM. Provides a reference implementation for building an inference stack on top of vLLM, enabling scaling, monitoring, request routing, and KV cache offloading with easy cloud deployment.
  6. vLLM Production Stack provides a reference implementation on how to build an inference stack on top of vLLM, allowing for scalable, monitored, and performant LLM deployments using Kubernetes and Helm.
  7. A user is seeking advice on deploying a new server with 4x H100 GPUs (320GB VRAM) for on-premise AI workloads. They are considering a Kubernetes-based deployment with RKE2, Nvidia GPU Operator, and tools like vLLM, llama.cpp, and Litellm. They are also exploring the option of GPU pass-through with a hypervisor. The post details their current infrastructure and asks for potential gotchas or best practices.
  8. An in-depth look at Choreo, an open-source Internal Developer Platform (IDP) built on Kubernetes and GitOps, utilizing 20+ CNCF tools to provide a secure, scalable, and developer-friendly experience. The article discusses the challenges of Kubernetes management, the illusion of 'platformless' solutions, and how Choreo aims to bridge the gap between developer freedom and enterprise requirements.
  9. Companies are increasingly moving away from cloud computing to on-premises servers to lower costs and regain control over their operations.
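Entry 1 describes an LLM-based approach to finding PII in logs. As a point of comparison, here is a minimal non-LLM sketch of the same idea using regex redaction; the function name and patterns below are illustrative only, not PII Guard's actual API, and real tools cover far more PII types:

```python
import re

# Illustrative patterns for two common PII types; production tools cover many more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def redact_pii(line: str) -> str:
    """Replace each detected PII value with a typed placeholder, e.g. [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        line = pattern.sub(f"[{label}]", line)
    return line

print(redact_pii("user=alice@example.com src=10.0.0.7 action=login"))
# → user=[EMAIL] src=[IPV4] action=login
```

Regexes are fast and deterministic but brittle against free-form PII (names, addresses), which is the gap LLM-based detection aims to close.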



About - Propulsed by SemanticScuttle