klotz: production engineering*

Production Engineering focuses on the design, implementation, and management of systems and processes to ensure the efficient and reliable delivery of software and services in a production environment. It involves various aspects such as deploying, monitoring, and maintaining applications, managing infrastructure, and handling data pipelines. Production Engineering KPIs include Availability and Cost.


  1. Hallux.ai is a platform offering open-source, LLM-based CLI tools for Linux and macOS. These tools aim to streamline operations, enhance productivity, and automate workflows for professionals in production engineering, SRE, and DevOps. They also improve Root Cause Analysis (RCA) capabilities and enable self-sufficiency.
  2. Outlier treatment is a necessary step in data analysis. This article, part 3 of a four-part series, walks through effective methods and tools for outlier detection.
  3. This article explains why CPU limits are considered harmful on Kubernetes. The author offers three analogies for how CPU limits behave and discusses best practices for setting CPU limits and requests.
  4. A guide to tracking in MLOps, covering how to track code, data, and machine learning models.
  5. Plandex is an AI coding agent designed to work directly in the terminal, capable of planning and completing large tasks that span many files and steps. It helps developers build new apps quickly, add features to existing codebases, write tests and scripts, understand code, and fix bugs.
  6. A collection of learning resources for those interested in becoming a Site Reliability Engineer (SRE) at Google, focusing on systems engineering best practices, non-abstract large system design, distributed systems, and reliable data processing.
    2024-07-05 by klotz
  7. This article explains the importance of data validation in a machine learning pipeline and demonstrates how to use TensorFlow Data Validation (TFDV) to validate data. It covers the five stages of machine learning data validation: generating statistics from training data, inferring a schema from training data, generating statistics for evaluation data and comparing them with the training data, identifying and fixing anomalies, and checking for drift and data skew.
  8. Kit is a free, open-source MLOps tool that simplifies AI project management by packaging models, datasets, code, and configurations into a standardized, versioned, and tamper-proof ModelKit. It enables collaboration, model traceability, and reproducibility, making it easier to hand off AI projects between data scientists, developers, and DevOps teams.
    2024-06-22 by klotz
  9. Explores KitOps, an open source project that bridges the gap between DevOps and machine learning pipelines by allowing you to leverage existing DevOps pipelines for MLOps tasks.

    ModelKits are standardized packages that contain all the necessary components of an ML project, including the model, datasets, code, and configuration files.

    ModelKits are defined using a YAML file called a Kitfile, which can be integrated seamlessly with existing DevOps pipelines, much like a Dockerfile for containerization.
  10. Explore the best LLM inference engines and servers available to deploy and serve LLMs in production, including vLLM, TensorRT-LLM, Triton Inference Server, RayLLM with RayServe, and HuggingFace Text Generation Inference.
    2024-06-21 by klotz
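
The five validation stages summarized in item 7 can be illustrated with a small sketch. This is not TFDV and does not call its API: it is a plain-Python stand-in using only the standard library, and every function name (`generate_stats`, `infer_schema`, `find_anomalies`, `find_drift`), the sample rows, and the 25% mean-shift threshold are assumptions made for illustration. In particular, the drift check here is a simple relative change in the mean, which is much cruder than the distribution comparisons a real validation library would use.

```python
from statistics import mean


def generate_stats(rows):
    """Stages 1 and 3: summarize each feature of a list of dict rows."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    stats = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        numeric = all(isinstance(v, (int, float)) for v in present)
        stats[name] = {
            "type": "numeric" if numeric else "categorical",
            "missing": len(rows) - len(present),
            "mean": mean(present) if numeric and present else None,
            "values": None if numeric else set(present),
        }
    return stats


def infer_schema(stats):
    """Stage 2: derive expected type, presence, and domain from training stats."""
    return {
        name: {"type": s["type"], "required": s["missing"] == 0, "domain": s["values"]}
        for name, s in stats.items()
    }


def find_anomalies(stats, schema):
    """Stage 4 (detection only): report where the stats violate the schema."""
    anomalies = []
    for name, rule in schema.items():
        s = stats.get(name)
        if s is None:
            anomalies.append(f"{name}: feature missing")
            continue
        if s["type"] != rule["type"]:
            anomalies.append(f"{name}: expected {rule['type']}, got {s['type']}")
        if rule["required"] and s["missing"] > 0:
            anomalies.append(f"{name}: {s['missing']} missing value(s)")
        if rule["domain"] is not None and s["values"] is not None:
            unexpected = s["values"] - rule["domain"]
            if unexpected:
                anomalies.append(f"{name}: unexpected values {sorted(unexpected)}")
    return anomalies


def find_drift(train_stats, eval_stats, threshold=0.25):
    """Stage 5 (simplified): flag numeric features whose mean shifted too far."""
    drifted = []
    for name, t in train_stats.items():
        e = eval_stats.get(name)
        if e is None or t["mean"] is None or e["mean"] is None:
            continue
        base = abs(t["mean"]) or 1.0  # avoid dividing by zero
        if abs(e["mean"] - t["mean"]) / base > threshold:
            drifted.append(name)
    return drifted


# Hypothetical sample data, invented for this sketch.
train_rows = [{"age": 30, "country": "US"}, {"age": 40, "country": "DE"}]
eval_rows = [{"age": 70, "country": "US"}, {"age": 80, "country": "FR"}]

train_stats = generate_stats(train_rows)
schema = infer_schema(train_stats)
eval_stats = generate_stats(eval_rows)
anomalies = find_anomalies(eval_stats, schema)
drifted = find_drift(train_stats, eval_stats)
print(anomalies)
print(drifted)
```

With these sample rows, the evaluation set introduces a country value that the training-derived schema has never seen, and the mean age shifts well past the threshold, so both checks report something. Fixing an anomaly (the second half of stage 4) is left out because it is a judgment call: either the data is cleaned or the schema is relaxed.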
