SemanticScuttle - klotz.me

Tags: metrics*

LLMs create a new blind spot in observability

Logs, metrics, and traces aren't enough. AI apps require visibility into prompts and completions to track everything from security risks to hallucinations.

2026-01-25 Tags: llm, observability, metrics, logs, traces, prompts, hallucinations, rag, cybersecurity by klotz

Cisco Released Cisco Time Series Model: Their First Open-Weights Foundation Model based on Decoder-only Transformer Architecture

Cisco and Splunk have introduced the Cisco Time Series Model, a univariate, zero-shot time series foundation model designed for observability and security metrics. It is released as an open-weight checkpoint on Hugging Face.

* **Multiresolution data is common:** The model handles data where fine-grained (e.g., 1-minute) and coarse-grained (e.g., hourly) data coexist, a typical pattern in observability platforms where older data is often aggregated.
* **Long context windows are needed:** It's built to use longer histories (up to 16,384 points) than many existing time series models, with the goal of improving forecasting accuracy.
* **Zero-shot forecasting is desired:** The model aims to provide accurate forecasts *without* requiring task-specific fine-tuning, making it readily applicable to a variety of time series datasets.
* **Quantile forecasting is important:** It predicts not just the mean forecast but also a range of quantiles (0.1 to 0.9), providing a measure of uncertainty.
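The multiresolution pattern in the first bullet can be sketched in a few lines. This is an illustration only, not the Cisco Time Series Model's API: the function name `build_multiresolution_context` and its parameters are hypothetical, and it assumes older data is aggregated by taking hourly means.

```python
from statistics import mean


def build_multiresolution_context(minute_values, fine_len=60, points_per_coarse=60):
    """Split a 1-minute series into a coarse history and a fine recent window.

    Hypothetical helper for illustration: the most recent `fine_len` points
    stay at 1-minute resolution, and everything older is averaged into
    blocks of `points_per_coarse` points (hourly means by default).
    """
    if fine_len <= 0 or points_per_coarse <= 0:
        raise ValueError("fine_len and points_per_coarse must be positive")
    if len(minute_values) < fine_len:
        raise ValueError("series is shorter than the fine window")

    older = list(minute_values[:-fine_len])
    fine = list(minute_values[-fine_len:])

    # Drop the oldest leftover points so the older span divides evenly.
    remainder = len(older) % points_per_coarse
    older = older[remainder:]

    coarse = [
        mean(older[i:i + points_per_coarse])
        for i in range(0, len(older), points_per_coarse)
    ]
    return coarse, fine


# Three hours of synthetic 1-minute data: two hourly means plus one fine hour.
series = [float(i) for i in range(180)]
coarse, fine = build_multiresolution_context(series)
print(len(coarse), len(fine))
```

A real pipeline would likely also keep timestamps alongside the values; they are left out here to keep the sketch short.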

2025-12-09 Tags: time series, foundation model, transformer, cisco, splunk, observability, metrics, machine learning, llm by klotz

Logs, Metrics & Traces: A Before and After Story

The company's transition from fragmented observability tools to a unified system using OpenTelemetry and OneUptime dramatically improved incident response times, reducing MTTR from 41 to 9 minutes. By correlating logs, metrics, and traces through structured logging and intelligent sampling, they eliminated much of the noise and confusion that previously slowed root cause analysis. The shift also reduced the number of dashboards engineers needed to check per incident and significantly lowered the percentage of incidents with unknown causes.

Key practices included instrumenting once with OpenTelemetry, enforcing cardinality limits, and archiving raw data for future analysis. The move away from 100% trace capture and over-instrumentation helped manage data volume while maintaining visibility into anomalies. This transformation emphasized that effective observability isn't about collecting more data, but about designing correlated signals that support intentional diagnosis and reduce cognitive load.

2025-08-21 Tags: observability, opentelemetry, logs, metrics, traces, production engineering by klotz

OpenTelemetry: A Guide to Observability with Go

This article provides an overview of OpenTelemetry, an open-source observability framework, and shows how to integrate it with Go applications. It covers key concepts like logs, metrics, and traces, and demonstrates setting up a reusable telemetry package using OpenTelemetry in Go.

2025-02-07 Tags: golang, opentelemetry, observability, logging, metrics, distributed tracing, go, grafana, production engineering by klotz

What Is OpenTelemetry? The Ultimate Guide

OpenTelemetry is not just an observability platform; it's a set of best practices and standards that can be integrated into platform engineering or DevOps.

2024-08-26 Tags: opentelemetry, observability, telemetry data, golden signals, metrics, logs, traces, platform engineering, production engineering by klotz

Metrics to Evaluate a Classification Machine Learning Model

This article explores various metrics used to evaluate the performance of classification machine learning models, including precision, recall, F1-score, accuracy, and alert rate. It explains how these metrics are calculated and provides insights into their application in real-world scenarios, particularly in fraud detection.
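As a rough sketch of how the metrics named above are computed from binary labels, the following uses the standard confusion-matrix definitions. The function name `classification_metrics` is hypothetical, and alert rate is assumed here to mean the fraction of all cases flagged as positive, which may differ from the article's exact definition.

```python
def classification_metrics(y_true, y_pred):
    """Compute common binary classification metrics from 0/1 labels.

    Assumption: alert rate = (TP + FP) / total, i.e. the share of cases
    the model flags as positive.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = tp + fp + fn + tn

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": (tp + tn) / total,
        "alert_rate": (tp + fp) / total,
    }


# Toy fraud-style example: 3 true positives among 10 cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In imbalanced settings such as fraud detection, accuracy can look high even when recall is poor, which is one reason to report precision, recall, and alert rate together.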

2024-08-01 Tags: machine learning, classification, metrics, evaluation, precision, recall, f1-score, accuracy, alert rate, fraud detection, llm by klotz

It’s Time to Finally Memorize Those Dang Classification Metrics!

This article discusses the importance of understanding and memorizing classification metrics in machine learning. The author shares their own experience and strategies for memorizing metrics such as accuracy, precision, recall, F1 score, and ROC AUC.

2024-06-24 Tags: classification, metrics, machine learning, data science, precision, recall, accuracy, roc, auc by klotz

Analysing Interactions with Friedman’s H-stat and Python

The article explains how to apply Friedman's H-statistic to determine whether complex machine learning models use interactions to make predictions. It uses the artemis package and interprets the pairwise, overall, and unnormalised metrics.

2024-06-21 Tags: statistics, metrics, friedman's h-statistic, machine learning, interactions, artemis, pairwise by klotz

How to log output of running models and performance monitoring

A discussion post on Reddit's LocalLLaMA subreddit about logging the output of running models and monitoring their performance, both for debugging errors and warnings and for performance analysis. The post also notes the need for flags that write logs to flat files and for GPU metrics (GPU utilization, RAM usage, TensorCore usage, etc.) to support troubleshooting and analytics.

2024-06-12 Tags: llama, python, logging, performance, monitoring, gpu, metrics, debugging, nvidia, analytics, production engineering, llms by klotz

Metrics, Traces, Logs — And Now, OpenTelemetry Profile Data

With the addition of profiling to OpenTelemetry, we expect continuous production profiling to hit the mainstream.

2024-06-01 Tags: metrics, traces, logs, opentelemetry, profiling, observability, ebpf, production engineering by klotz
