Three vendors – Cohesity, ServiceNow, and Datadog – have partnered to create a recoverability service that addresses the risks of running IT operations with agentic AI (AIOps). The service aims to restore systems to a "trusted state" by identifying and recovering files and data corrupted by AI errors or malicious attacks.
The companies anticipate increased adoption of agentic AI for system operation but recognize the potential for errors and vulnerabilities. Their solution focuses on preserving immutable snapshots of AI environments, enabling point-in-time recovery of agents, data, and infrastructure components, including vector stores and agent memory.
ServiceNow and Datadog provide control and observability platforms to detect anomalies, triggering API-driven restorations when problems are identified. This offering competes with Rubrik's similar tool and native rollback capabilities from vendors like Cisco. Gartner predicts a significant increase in the integration of task-specific agents in enterprise applications, while Forrester emphasizes the need for guardrails and strong oversight in agentic AI development.
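The detect-then-restore flow described above can be illustrated with a minimal sketch. It assumes a list of timestamped snapshots and an anomaly start time reported by an observability tool, and it picks the latest snapshot taken before the anomaly began. Every name here (`Snapshot`, `pick_restore_point`, `handle_anomaly`, the `restore` callback) is hypothetical and does not correspond to any Cohesity, ServiceNow, or Datadog API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)  # frozen: a snapshot record cannot be modified after creation
class Snapshot:
    snapshot_id: str
    taken_at: datetime
    components: Tuple[str, ...]  # e.g. ("agent", "vector_store", "agent_memory")


def pick_restore_point(
    snapshots: Iterable[Snapshot], anomaly_started_at: datetime
) -> Optional[Snapshot]:
    """Return the most recent snapshot taken strictly before the anomaly began."""
    candidates = [s for s in snapshots if s.taken_at < anomaly_started_at]
    return max(candidates, key=lambda s: s.taken_at, default=None)


def handle_anomaly(
    snapshots: Iterable[Snapshot],
    anomaly_started_at: datetime,
    restore: Callable[[Snapshot], None],
) -> Optional[str]:
    """Pick a restore point and pass it to the caller-supplied restore callback.

    `restore` stands in for whatever API call a real backup platform would
    expose; it is an assumption of this sketch.
    """
    target = pick_restore_point(snapshots, anomaly_started_at)
    if target is None:
        return None  # no snapshot predates the anomaly, so nothing is restored
    restore(target)
    return target.snapshot_id
```

The sketch treats the anomaly start time as the boundary of the "trusted state"; a real service would likely also need to verify that the chosen snapshot is itself clean.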
This paper investigates how large language models (LLMs) solve mental math problems. It proposes that meaningful computation occurs late in the network (in terms of layer depth) and primarily at the last token, which receives information from the other tokens in specific middle layers. The authors introduce two techniques, CAMA and ABP, to identify an 'All-for-One' subgraph responsible for this behavior, and they demonstrate that the subgraph is both sufficient and necessary for high performance across various models and input styles.
This paper introduces Toto, a time series forecasting foundation model with 151 million parameters, and BOOM, a large-scale benchmark for observability time series data. Toto uses a decoder-only architecture and is trained on a large corpus of observability, open, and synthetic data. Both Toto and BOOM are open-sourced under the Apache 2.0 License.
Datadog announces the release of Toto, an open-weights time series foundation model, and BOOM, a new observability benchmark. Toto achieves state-of-the-art performance on observability metrics, and BOOM provides a challenging dataset for evaluating time series models in the observability domain.
Datadog has acquired Metaplane to expand its data observability offerings, particularly for AI applications. Metaplane uses AI-powered anomaly detection and data lineage tracking. The acquisition aims to unify observability across applications and data.
Sawmills AI has introduced a smart telemetry data management platform aimed at reducing costs and improving data quality for enterprise observability. By acting as a middleware layer that uses AI and ML to optimize telemetry data before it reaches vendors like Datadog and Splunk, Sawmills helps companies manage data efficiently, retain data sovereignty, and reduce unnecessary data processing costs.
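To make the middleware idea concrete, here is a minimal sketch of reducing telemetry before it is forwarded to a vendor. It uses fixed rules (dropping selected log levels and collapsing exact duplicates into a count) rather than the AI and ML techniques Sawmills describes, so it illustrates the general pattern only. The record fields and the `reduce_telemetry` function are assumptions of this example, not part of any Sawmills, Datadog, or Splunk interface.

```python
from collections import Counter


def reduce_telemetry(records, drop_levels=frozenset({"DEBUG"})):
    """Drop low-value log levels and collapse exact duplicates.

    Each record is assumed to be a dict with "service", "level", and
    "message" keys (a hypothetical schema for this sketch). Duplicates are
    merged into a single record carrying a "count" field.
    """
    counts = Counter()
    for rec in records:
        if rec["level"] in drop_levels:
            continue  # filtered out before it ever reaches the vendor
        counts[(rec["service"], rec["level"], rec["message"])] += 1
    # Counter keeps first-seen order, so output order follows the input.
    return [
        {"service": service, "level": level, "message": message, "count": n}
        for (service, level, message), n in counts.items()
    ]
```

Because the reduction happens before data leaves the company's own pipeline, the volume billed by the downstream vendor shrinks while the raw records can still be retained locally.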
OpenTelemetry offers a standardized process for observability, but its functionality is a work in progress. Its usefulness depends on the observability tools and platforms used in conjunction with OpenTelemetry.