This article explores the emerging category of AI-powered operations agents, comparing AI DevOps engineers and AI SRE agents, how cloud providers are responding, and what engineers should consider when evaluating these tools.
Logs, metrics, and traces aren't enough. AI apps require visibility into prompts and completions to track everything from security risks to hallucinations.
This post introduces **GIST (Greedy Independent Set Thresholding)**, a new algorithm for selecting diverse and useful data subsets for machine learning. GIST tackles the NP-hard problem of balancing diversity (minimizing redundancy) and utility (relevance to the task) in large datasets.
**Key points:**
* **Approach:** GIST prioritizes minimum distance between selected data points (diversity) then uses a greedy algorithm to approximate the highest-utility subset within that constraint, testing various distance thresholds.
* **Guarantee:** GIST is guaranteed to find a subset with at least half the value of the optimal solution.
* **Performance:** Experiments demonstrate GIST outperforms existing methods (Random, Margin, k-center, Submod) in image classification and single-shot downsampling.
* **Application:** Already used to improve video recommendation diversity at YouTube.
**GIST provides a mathematically grounded and efficient solution for selecting high-quality data subsets for machine learning, crucial as datasets scale.**
.
Stridetastic is an open-source monitoring and observability framework for Meshtastic® LoRa mesh networks. It helps operators capture, inject, and visualize mesh network data.
Nemo Agent Toolkit simplifies building production-ready LLM applications by providing tools for creating, managing, and deploying agents. It offers features like memory management, tool usage, and observability, making it easier to integrate LLMs into real-world applications.
Cisco and Splunk have introduced the Cisco Time Series Model, a univariate zero shot time series foundation model designed for observability and security metrics. It is released as an open weight checkpoint on Hugging Face.
* **Multiresolution data is common:** The model handles data where fine-grained (e.g., 1-minute) and coarse-grained (e.g., hourly) data coexist, a typical pattern in observability platforms where older data is often aggregated.
* **Long context windows are needed:** It's built to leverage longer historical data (up to 16384 points) than many existing time series models, improving forecasting accuracy.
* **Zero-shot forecasting is desired:** The model aims to provide accurate forecasts *without* requiring task-specific fine-tuning, making it readily applicable to a variety of time series datasets.
* **Quantile forecasting is important:** It predicts not just the mean forecast but also a range of quantiles (0.1 to 0.9), providing a measure of uncertainty.
hl is a fast, Rust-based JSON log viewer designed for efficient processing of structured logs. It offers fast indexing and parsing, enabling quick scanning of large log files.
This article details the steps to move a Large Language Model (LLM) from a prototype to a production-ready system, covering aspects like observability, evaluation, cost management, and scalability.
Ship measurable improvements in your GenAI systems with Opik, your open-source LLM observability and agent optimization platform. Trusted by over 150,000 developers and thousands of companies.
Grafana and GitLab have released a new open-source solution that links GitLab CI/CD events into Grafana's observability stack via a serverless architecture, enabling real-time visibility and correlation between deploy events and performance metrics.