AWS has released the general availability of its DevOps Agent, a generative AI assistant designed to automate incident investigation and operational tasks. Built on Amazon Bedrock AgentCore, the tool integrates with observability platforms, code repositories, and CI/CD pipelines to autonomously triage issues and correlate telemetry data. New capabilities include support for investigating applications in Azure and on-premises environments, custom agent skills, and personalized reporting.
Key highlights:
* Autonomous incident investigation triggered by webhooks from sources like CloudWatch or PagerDuty.
* Integration with major tools including Datadog, Grafana, Splunk, GitHub, and GitLab.
* Reported performance improvements of up to 75% lower MTTR during preview.
* Pricing model based on cumulative time spent on operational tasks per second.
Three vendors โ Cohesity, ServiceNow, and Datadog โ have partnered to create a recoverability service designed to address the risks associated with agentic AI (AIOps). The service aims to restore systems to a "trusted state" by identifying and recovering files and data corrupted by AI errors or malicious attacks.
The companies anticipate increased adoption of agentic AI for system operation but recognize the potential for errors and vulnerabilities. Their solution focuses on preserving immutable snapshots of AI environments, enabling point-in-time recovery of agents, data, and infrastructure components, including vector stores and agent memory.
ServiceNow and Datadog provide control and observability platforms to detect anomalies, triggering API-driven restorations when problems are identified. This offering competes with Rubrik's similar tool and native rollback capabilities from vendors like Cisco. Gartner predicts a significant increase in the integration of task-specific agents in enterprise applications, while Forrester emphasizes the need for guardrails and strong oversight in agentic AI development.
This article explores the emerging category of AI-powered operations agents, comparing AI DevOps engineers and AI SRE agents, how cloud providers are responding, and what engineers should consider when evaluating these tools.