AWS re:Invent 2025: AWS DevOps Agent Enters Preview to Accelerate Incident Response and Improve System Reliability

December 5, 2025

What is AWS re:Invent?
AWS re:Invent is Amazon Web Services’ largest annual cloud conference, where AWS announces new services, enhancements, and strategic direction for the coming year. The event features keynotes, technical deep dives, hands-on sessions, and hundreds of product launches, making it one of the most influential cloud events in the industry.

What AWS Announced

AWS has introduced AWS DevOps Agent, a new frontier agent now in public preview, designed to autonomously investigate incidents, identify root causes, and proactively recommend improvements that reduce the likelihood of future issues. Unlike traditional automation or monitoring tools, AWS DevOps Agent operates as an always-on virtual team member that analyzes telemetry, code changes, deployments, and operational patterns across your environment.

The service is available at no additional cost during preview in the US East (N. Virginia) Region and can monitor applications deployed across any AWS Region and multiple AWS accounts.

Why This Matters

When production issues occur, on-call engineers must review metrics, logs, deployments, and dependencies under significant pressure and tight timelines. Root cause analysis is often slowed by tooling that is fragmented across observability platforms, CI pipelines, and infrastructure change histories. After resolution, teams rarely have time to extract long-term improvements from past incidents.

AWS DevOps Agent addresses these challenges with autonomous investigations, correlated insights from the full operational toolchain, and recommendations that strengthen system resilience. This helps reduce mean time to resolution and supports continuous operational improvement without requiring major workflow changes.

Key Capabilities

Autonomous Incident Response

AWS DevOps Agent begins investigating as soon as an alert arrives from services like CloudWatch or from external tools such as ServiceNow or PagerDuty. It correlates metrics, logs, traces, and recent deployments from platforms including GitHub, GitLab, Datadog, Dynatrace, New Relic, and Splunk.

The agent identifies probable root causes and updates incident channels in Slack with findings, timelines, and recommendations. It can also act as a virtual incident coordinator by managing communication and stakeholder updates.
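
To make the correlation step concrete, here is a minimal sketch of one thing such an investigation might do: list the deployments that landed shortly before an alert fired. It is only an illustration and not how AWS DevOps Agent is implemented; the `Deployment` type, the `deployments_before_alert` helper, the one-hour lookback window, and the sample services and commits are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:
    """Hypothetical record of a deployment event (not an AWS data type)."""
    service: str
    commit: str
    deployed_at: datetime


def deployments_before_alert(deployments, alert_time, lookback=timedelta(hours=1)):
    """Return deployments that landed within `lookback` before the alert, newest first."""
    window_start = alert_time - lookback
    recent = [d for d in deployments if window_start <= d.deployed_at <= alert_time]
    return sorted(recent, key=lambda d: d.deployed_at, reverse=True)


# Illustrative data only: an alert at 14:30 and three earlier deployments.
alert_time = datetime(2025, 12, 5, 14, 30)
history = [
    Deployment("checkout", "a1b2c3d", datetime(2025, 12, 5, 14, 10)),
    Deployment("search", "d4e5f6a", datetime(2025, 12, 5, 9, 0)),
    Deployment("payments", "b7c8d9e", datetime(2025, 12, 5, 13, 45)),
]

suspects = deployments_before_alert(history, alert_time)
for d in suspects:
    minutes = int((alert_time - d.deployed_at).total_seconds() // 60)
    print(f"{d.service} ({d.commit}) deployed {minutes} min before the alert")
```

Time proximity alone does not establish a root cause; it only narrows the list of changes worth examining alongside metrics, logs, and traces.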

Interactive Investigations

On-call teams can use the AWS DevOps Agent web app to trigger investigations, view analysis details, examine the application topology, and ask follow-up questions in natural language. Operators can refine the agent’s analysis by providing additional context, adjusting scopes, or steering the investigation toward specific resources or logs.

Proactive Operational Improvements

Beyond resolving incidents, AWS DevOps Agent detects patterns across historical events to uncover systemic gaps. It provides targeted recommendations in areas such as observability, infrastructure configuration, capacity tuning, testing, and deployment pipeline quality. This helps teams move from reactive processes to proactive reliability engineering.

Intelligent Application Topology

The agent continuously builds and updates a topology graph that maps AWS resources and their relationships, including compute, storage, networking, and deployment histories. This topology allows the agent to understand how changes in one part of the system may influence another during investigations.
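
As a rough illustration of why a topology graph helps, the sketch below models resources as a dependency map and walks it to find everything that could be affected by a change to one resource. The resource names, the `DEPENDS_ON` map, and the `impacted_by` helper are hypothetical and do not reflect the agent's internal representation.

```python
from collections import deque

# Hypothetical topology: each key depends on the resources in its list.
DEPENDS_ON = {
    "api-gateway": ["orders-lambda"],
    "orders-lambda": ["orders-table", "payments-service"],
    "payments-service": ["payments-db"],
    "reports-job": ["orders-table"],
}


def impacted_by(changed, depends_on):
    """Return every resource that depends, directly or transitively, on `changed`."""
    # Invert the edges so we can walk from a resource to its dependents.
    dependents = {}
    for resource, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(resource)

    # Breadth-first search over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(impacted_by("payments-db", DEPENDS_ON)))
```

In this toy graph, a change to `payments-db` reaches `payments-service`, then `orders-lambda`, then `api-gateway`, while `reports-job` is unaffected.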

Extensible Tool Integrations

Teams can connect additional tools using the Model Context Protocol (MCP), enabling the agent to ingest data from open source platforms like Prometheus and Grafana or internal tooling. This creates a unified investigation surface across complex multicloud and hybrid environments.
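
For readers unfamiliar with MCP, tool invocations travel as JSON-RPC 2.0 messages, and a client calls a tool with the `tools/call` method. The sketch below builds such a request using only the standard library. The tool name `query_prometheus` and its `query` argument are hypothetical; a real MCP server defines its own tool names and input schemas, and this is not a description of how AWS DevOps Agent issues calls.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool exposed by a team-run MCP server in front of Prometheus.
request = build_tool_call(
    "query_prometheus",
    {"query": "rate(http_requests_total[5m])"},
)
print(json.dumps(request, indent=2))
```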

How It Works

AWS DevOps Agent organizes work through Agent Spaces, logical containers that define which AWS accounts, tools, and resources a particular agent can access. Administrators configure Agent Spaces and integrations in the AWS Management Console, while operators use the dedicated web app for interactive investigations.
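
One way to picture an Agent Space is as a scope object that an investigation is checked against. The sketch below is a conceptual model only: the `AgentSpace` class, its fields, and the placeholder account IDs are assumptions for illustration and are not the service's API or configuration schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentSpace:
    """Hypothetical model of an Agent Space as an access boundary."""
    name: str
    account_ids: frozenset = field(default_factory=frozenset)
    integrations: frozenset = field(default_factory=frozenset)

    def can_access_account(self, account_id):
        """True if the AWS account is inside this space's scope."""
        return account_id in self.account_ids

    def can_use(self, integration):
        """True if the tool integration is connected to this space."""
        return integration in self.integrations


# Placeholder account IDs and integrations, for illustration only.
payments_space = AgentSpace(
    name="payments-prod",
    account_ids=frozenset({"111111111111", "222222222222"}),
    integrations=frozenset({"cloudwatch", "github", "slack"}),
)

print(payments_space.can_access_account("111111111111"))  # in scope
print(payments_space.can_access_account("333333333333"))  # out of scope
print(payments_space.can_use("datadog"))                  # not connected
```

Modeling the scope as an immutable value mirrors the idea that administrators define the boundary once and operators work within it.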

Investigations can be started automatically when alarms trigger or manually using predefined investigation paths, such as analyzing recent alarm triggers, high CPU usage, or spikes in application errors.

During an investigation, the agent analyzes telemetry, traces application stack relationships, reviews recent deployments, and consolidates findings in a detailed summary that includes root cause candidates and mitigation guidance.

Availability and Preview Details

AWS DevOps Agent is available today in preview in the US East (N. Virginia) Region. The agent itself runs in us-east-1, but it can monitor workloads running in any AWS Region and across multiple AWS accounts. There is no charge during the preview period, although usage limits apply to the number of agent task hours per month.

Teams can sign up for the preview and begin integrating their observability and deployment tools through the AWS Management Console.

Official Sources

AWS What’s New: https://aws.amazon.com/about-aws/whats-new/2025/12/devops-agent-preview-frontier-agent-operational-excellence/

DevOps Agent User Guide: https://docs.aws.amazon.com/devopsagent/latest/userguide/what-is.html

AWS Blog: https://aws.amazon.com/blogs/aws/aws-devops-agent-helps-you-accelerate-incident-response-and-improve-system-reliability-preview/ 

Forged Concepts

Explore expert cloud, AWS, and DevOps insights by Forged Concepts, a trusted AWS MSP.