klotz: logging* + production engineering*

  1. This article details how Nubank built its own in-house logging platform to address issues of cost, scalability, and control over their logging infrastructure. Initially reliant on a vendor solution, they found costs rising unpredictably and experienced limitations in observability and data retention.

    To solve this, Nubank divided the project into two major steps: **The Observability Stream** (ingestion and processing) and the **Query & Log Platform** (storage and querying).

    * **Observability Stream:** Fluent Bit for data collection, a Data Buffer Service for micro-batching, and an in-house Filter & Process Service.
    * **Query & Log Platform:** Trino as the query engine, AWS S3 for storage, and Parquet for data format.

    The new platform currently ingests 1 trillion logs daily, stores 45 PB of searchable data with 45-day retention, and handles almost 15,000 queries daily. Nubank reports that the platform costs 50% less than comparable market solutions while giving the company greater control, scalability, and the ability to customize features. The project underscored Nubank's value of challenging the status quo and leveraging a combination of open-source and in-house development.
  2. A guide to building a robust logging system in Python, covering structured logging, log levels, handlers, formatters, filters, and integrating logging with modern observability practices.
  3. This article provides an overview of OpenTelemetry, an open-source observability framework, and a guide to integrating it with Go applications. It covers key concepts like logs, metrics, and traces, and demonstrates setting up a reusable telemetry package using OpenTelemetry in Go.
  4. Ollogger is a powerful, flexible logging application that helps users create custom AI-powered logging assistants. It is built with React, TypeScript, and other modern web technologies.
  5. Cloudflare, an internet infrastructure and security company, has upgraded its logging pipeline by migrating from syslog-ng to OpenTelemetry Collector. This change affects one of Cloudflare's largest data pipelines, processing millions of log events per second.

    **Key Points**

    * **Motivations:** Language compatibility, easier integration, enhanced metrics, and unified telemetry infrastructure.
    * **Custom Components:** Custom exporter, modified file exporter, processors, and rate-limiters.
    * **Migration and Future Plans:** Careful rollout, monitoring, and plans for more sophisticated log sampling techniques and open-source contributions.
    * **Other Adopters:** Google, Splunk, Shopify, and GitHub are also adopting OpenTelemetry for various use cases.
  6. Use callbacks to send output data to PostHog, Sentry, and similar services. LiteLLM provides input_callbacks, success_callbacks, and failure_callbacks to easily send data based on response status.
  7. Docker offers various logging drivers that dictate the storage location and format of log messages. These include json-file, syslog, journald, fluentd, awslogs, gelf, logentries, and splunk.
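
The Nubank summary above mentions a Data Buffer Service that micro-batches logs before processing, without describing how it works. As a rough illustration of the general idea only, here is a minimal size-triggered batcher; the `MicroBatcher` class, its parameters, and the sink callback are hypothetical names and not taken from the article. A production buffer would typically also flush on a timer and handle sink failures.

```python
from typing import Callable, List


class MicroBatcher:
    """Collect log lines and hand them to a sink in small batches.

    Hypothetical sketch: flushes only when `max_items` is reached or when
    flush() is called explicitly.
    """

    def __init__(self, sink: Callable[[List[str]], None], max_items: int = 3):
        self._sink = sink
        self._max_items = max_items
        self._buffer: List[str] = []

    def add(self, line: str) -> None:
        # Buffer the line; emit a batch once the size threshold is hit.
        self._buffer.append(line)
        if len(self._buffer) >= self._max_items:
            self.flush()

    def flush(self) -> None:
        # Swap the buffer out before calling the sink so the sink
        # always receives its own list.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._sink(batch)
```

Passing `list.append` of a plain list as the sink is enough to observe the batches; calling `flush()` at shutdown drains whatever is left below the threshold.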

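The Python logging guide listed above covers structured logging, levels, handlers, and formatters. A minimal sketch of how those pieces fit together with the standard library alone might look like the following; the `JsonFormatter` class, the `build_logger` helper, and the convention of passing fields under `extra={"context": ...}` are assumptions made for illustration, not the guide's own code.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumed convention: callers attach structured fields via
        # extra={"context": {...}}, which logging copies onto the record.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Return a logger that writes INFO-and-above records as JSON lines."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

Called as `build_logger("svc", sys.stderr).info("started", extra={"context": {"request_id": "abc"}})`, this emits one JSON line per event, which is the shape most log collectors expect to parse.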