Amazon just released the most complete platform for building production AI agents on AWS, and most teams have not heard of it yet.
Amazon Bedrock AgentCore became generally available in late 2025, expanded to new regions through early 2026, and picked up a wave of major new features in April and May 2026. It is moving fast. This guide covers everything you need to know: what it is, how each piece works, what it costs, and whether it is the right choice for your use case.
What Is Amazon Bedrock AgentCore?
Amazon Bedrock AgentCore is a fully managed platform for building, deploying, and running AI agents at production scale. You write the agent code, AgentCore handles everything else: infrastructure, session isolation, memory, tool connections, security, scaling, and monitoring.
The short definition: build agents with any framework and any model, ship them to production without managing servers.
Before AgentCore existed, building a production AI agent on AWS meant stitching together many separate pieces yourself. You had to handle session state, build a memory store, wire up tool integrations, configure IAM for every external service, add observability, manage scaling, and keep all of it secure. Teams were spending weeks on infrastructure before writing a single line of agent logic.
AgentCore replaces all of that plumbing with managed services you can adopt one piece at a time.
AgentCore vs Bedrock Agents: What Is the Difference?
This is the first question most people ask, and it is a fair one because the names are confusingly similar.
Bedrock Agents (the older service) is a fully managed, configuration-based tool. You define what your agent can do through the AWS console or API, point it at a Knowledge Base, give it some tools, and AWS runs the whole agent loop for you. You write almost no code. It is fast to set up and works well for straightforward use cases.
Amazon Bedrock AgentCore is a platform layer. You write the agent code yourself using any framework you want (LangGraph, CrewAI, LlamaIndex, Strands Agents, or your own), then deploy it on AgentCore. AgentCore gives you the infrastructure that code runs on, plus memory, a tool gateway, security, and observability. You control the agent logic completely. AgentCore controls the production operations.
A good way to think about it: Bedrock Agents is the managed experience for teams that want to get started fast. AgentCore is the production platform for teams that need full control over agent behavior but do not want to build infrastructure.
You can also use both. Some teams run Bedrock Agents for simple internal tools and deploy custom AgentCore agents for complex customer-facing workflows.
The Seven Components of AgentCore
AgentCore is modular. You can use each piece independently or combine them. Here is what each one does.
1. AgentCore Runtime
Runtime is where your agent code actually runs. It is a serverless compute environment built specifically for agent workloads.
Each user session gets its own microVM with isolated CPU, memory, and filesystem. When the session ends, the microVM is terminated and memory is wiped. This means two users can never see each other’s data, even when running the same agent.
Runtime supports sessions up to 8 hours long, which matters for complex tasks that take time. It accepts payloads up to 100 MB, so agents can work with images, audio, and large documents. It scales automatically, and you only pay for the compute your agent actually uses, not idle time.
The versioning system is built in. Every change to your agent configuration creates a new version. You can run multiple versions in parallel (dev, staging, production) and roll back instantly if something breaks.
Runtime works with any framework. You are not locked into Strands, LangChain, or any other specific tool.
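The versions-plus-environments model is easiest to see as a small data structure. This is a mental model only, not the AgentCore API; `VersionRegistry`, `publish`, and `point` are illustrative names.

```python
from dataclasses import dataclass, field

@dataclass
class VersionRegistry:
    """Toy model of Runtime versioning: immutable versions, movable environment pointers."""
    versions: list = field(default_factory=list)   # every config change appends here
    aliases: dict = field(default_factory=dict)    # e.g. "prod" -> version number

    def publish(self, config: dict) -> int:
        """Each configuration change creates a new, immutable version."""
        self.versions.append(config)
        return len(self.versions)  # 1-based version number

    def point(self, alias: str, version: int) -> None:
        """Dev, staging, and prod are just pointers; rollback is re-pointing."""
        self.aliases[alias] = version

    def resolve(self, alias: str) -> dict:
        return self.versions[self.aliases[alias] - 1]

registry = VersionRegistry()
v1 = registry.publish({"model": "claude", "timeout_s": 3600})
v2 = registry.publish({"model": "claude", "timeout_s": 28800})
registry.point("prod", v2)
registry.point("prod", v1)  # instant rollback: prod serves v1 again
```

Because versions are never mutated in place, running dev, staging, and production in parallel is just three aliases pointing at (possibly) three different versions.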
2. AgentCore Gateway
Gateway is how your agent connects to the outside world. It turns your existing APIs, Lambda functions, and databases into tools that agents can call.
The problem Gateway solves: agents need tools, but tool integrations are painful to build. You have to handle authentication, error handling, schema validation, and rate limiting for every single external service. Gateway does all of that for you.
You point Gateway at a Lambda function or an OpenAPI spec, and it automatically wraps it into an agent-ready tool with MCP (Model Context Protocol) support. Your agent calls the tool. Gateway handles the connection, the auth, and the logging.
Gateway also supports semantic search across your tools, so when your agent has access to dozens or hundreds of tools, it can find the right one for the task instead of being given a giant list every turn.
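Gateway's semantic search is embedding-based; a toy version using keyword overlap still shows the shape of the idea. Everything below is illustrative, not a Gateway API.

```python
def score(query_words: set, description: str) -> int:
    """Crude relevance: how many words the query shares with a tool description."""
    return len(query_words & set(description.lower().split()))

def search_tools(query: str, tools: list, top_k: int = 1) -> list:
    """Return the top_k tool names whose descriptions best match the query."""
    q = set(query.lower().split())
    ranked = sorted(tools, key=lambda t: score(q, t["description"]), reverse=True)
    return [t["name"] for t in ranked[:top_k]]

tools = [
    {"name": "get_order_status", "description": "look up the shipping status of a customer order"},
    {"name": "create_ticket", "description": "open a support ticket for an unresolved issue"},
    {"name": "refund_payment", "description": "issue a refund for a customer payment"},
]
print(search_tools("where is my order shipping", tools))  # → ['get_order_status']
```

The payoff is context-window economics: the agent receives one or two relevant tools per turn instead of a catalog of hundreds.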
3. AgentCore Memory
LLMs forget everything between conversations. AgentCore Memory fixes that.
Memory has two layers:
- Short-term memory stores the immediate conversation context. The agent remembers what was said earlier in this session.
- Long-term memory stores facts, preferences, and summaries across sessions. The agent remembers your name, your shipping address, your allergies, your project history, whatever you tell it.
What makes Memory different from rolling your own with a vector database is the automatic extraction. As conversations happen, AgentCore runs a background process that pulls out useful facts and stores them as structured memory records. You do not have to write the extraction logic. You configure a strategy (semantic, summary, user preferences, or custom) and AgentCore handles the rest.
You can retrieve memories with semantic search, so the agent gets only the relevant facts for the current conversation, not the entire history.
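The extract-then-retrieve loop can be sketched in plain Python. The real service uses LLM-driven extraction and embedding-based retrieval; the regexes and substring matching below are stand-ins for those pieces.

```python
import re

# Stand-in for a configured extraction strategy (real AgentCore uses an LLM here).
PATTERNS = {
    "name": re.compile(r"my name is (\w+)", re.I),
    "allergy": re.compile(r"allergic to (\w+)", re.I),
}

def extract_facts(utterance: str) -> dict:
    """Background extraction: pull structured records out of a raw conversation turn."""
    facts = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            facts[key] = match.group(1)
    return facts

def retrieve(store: dict, query: str) -> dict:
    """Toy 'semantic' retrieval: return only the records relevant to the query."""
    q = query.lower()
    return {k: v for k, v in store.items() if k in q}

store = {}
for turn in ["Hi, my name is Dana", "I'm allergic to peanuts, by the way"]:
    store.update(extract_facts(turn))

# A later session gets only the relevant fact, not the whole history.
print(retrieve(store, "check the allergy list before ordering lunch"))
```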
4. AgentCore Identity
Identity gives each agent its own identity, separate from the user it is acting for. This sounds small but it matters a lot in production.
Identity handles two flows:
- Inbound auth: users authenticate into the agent through Okta, Microsoft Entra ID, Cognito, or any standard identity provider. Each user only sees the agents they have permission to use.
- Outbound auth: the agent authenticates to third-party services like Slack, GitHub, or Zoom on the user’s behalf using OAuth or API keys. Credentials are stored securely and never exposed in agent code or logs.
This solves one of the hardest parts of building agents. Without Identity, you end up with messy code that handles tokens, refreshes credentials, and tries not to leak secrets in error messages.
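The contract Identity provides for outbound auth can be sketched as a credential vault: tokens live outside agent code, and nothing secret appears in logs or error messages. `CredentialVault` is a hypothetical name for illustration, not an AgentCore class.

```python
class CredentialVault:
    """Toy outbound-auth vault: tokens stored per (user, service), never logged."""
    def __init__(self):
        self._tokens = {}

    def store(self, user: str, service: str, token: str) -> None:
        self._tokens[(user, service)] = token

    def authorize_call(self, user: str, service: str) -> dict:
        """Build the outbound auth header; fail if the user never connected the service."""
        token = self._tokens.get((user, service))
        if token is None:
            raise PermissionError(f"{user} has not connected {service}")
        return {"Authorization": f"Bearer {token}"}

    def __repr__(self):
        # Keep secrets out of logs and stack traces: expose a count, never a token.
        return f"CredentialVault(credentials={len(self._tokens)})"

vault = CredentialVault()
vault.store("alice", "github", "gho_secret123")
print(vault.authorize_call("alice", "github")["Authorization"])
print(vault)  # repr never contains the token
```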
5. AgentCore Observability
Observability tells you what your agent is actually doing in production.
Built on Amazon CloudWatch with OpenTelemetry support, Observability captures every step the agent takes: the reasoning trace, every tool call, every model response, every error. You see the agent’s decision-making process, not just the final output.
This matters because debugging agents is hard. When an agent gives a bad answer, you need to know whether it picked the wrong tool, got bad data from a tool, or just reasoned poorly with the right data. Observability shows you each step so you can fix the actual problem.
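A minimal sketch of what a trace buys you during debugging. This is illustrative only; the real data flows through CloudWatch and OpenTelemetry spans rather than a Python list.

```python
import time

class Trace:
    """Toy step recorder in the spirit of OpenTelemetry: one entry per agent action."""
    def __init__(self):
        self.steps = []

    def record(self, kind: str, detail: str) -> None:
        self.steps.append({"kind": kind, "detail": detail, "ts": time.time()})

    def first_error(self):
        """Pinpoint the failing step, so you can tell bad tool data from bad reasoning."""
        return next((s for s in self.steps if s["kind"] == "error"), None)

trace = Trace()
trace.record("reasoning", "user asked for order status; choosing get_order_status")
trace.record("tool_call", "get_order_status(order_id='A17')")
trace.record("error", "tool returned HTTP 500")
trace.record("model_response", "apologized and offered to retry")

bad_step = trace.first_error()
print(bad_step["detail"])  # → tool returned HTTP 500
```

Here the trace shows the agent chose the right tool and got bad data back, so the fix belongs in the tool, not the prompt.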
6. AgentCore Built-in Tools (Browser and Code Interpreter)
Two tools come built in because almost every agent needs them.
Browser tool: a secure, sandboxed browser the agent can use to navigate websites, fill forms, and extract content. Useful for any workflow that involves a website without an API. Each session runs in its own browser instance, so there is no contamination between users.
Code Interpreter: a sandboxed Python environment for the agent to run code. Useful for data analysis, generating charts, doing math, or anything else where the agent needs to compute something rather than guess.
Both tools follow the same per-second consumption pricing model as Runtime. You only pay when the tool is actively running.
7. AgentCore Policy and Evaluations
Two more recent additions that matter for production deployments.
Policy lets you define rules that constrain what the agent can do. You can write rules in plain English (“agents cannot delete production database records” or “agents cannot make purchases over $500”) and AgentCore translates them into Cedar policies that are enforced at the infrastructure layer. The agent cannot bypass them, even if a user tries to trick it.
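A Python stub of the enforcement idea, using the two example rules above. In AgentCore the rules compile to Cedar and run at the infrastructure layer where the model cannot reach them; this sketch only illustrates the shape of a check that runs on every tool call.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str
    args: dict

def enforce(call: ToolCall) -> bool:
    """Infrastructure-layer gate: evaluated on every tool call, outside the model's control."""
    if call.tool == "make_purchase" and call.args.get("amount_usd", 0) > 500:
        return False  # "agents cannot make purchases over $500"
    if call.tool == "delete_record" and call.args.get("db") == "production":
        return False  # "agents cannot delete production database records"
    return True

print(enforce(ToolCall("make_purchase", {"amount_usd": 120})))   # allowed
print(enforce(ToolCall("make_purchase", {"amount_usd": 9000})))  # blocked
```

Because the gate sits between the agent and the tool, a prompt-injected agent can ask for the forbidden action all it wants; the call is still denied.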
Evaluations measures agent quality automatically: it samples production traffic, scores agent responses against quality criteria you define, and flags drift over time. You can also run on-demand evaluations as part of CI/CD before promoting changes to production.
The newer Recommendations feature analyzes traces and suggests specific improvements to your system prompts and tool descriptions, then validates them with batch evaluation or A/B tests before you ship.
What You Can Actually Build with AgentCore
The use cases that make sense for AgentCore are not generic chatbots. They are agents that take action across systems with real consequences.
Customer support agents that answer questions, look up order status, process returns, and escalate to humans when needed. AgentCore handles the per-user session isolation so customer A never sees customer B’s data.
Internal IT helpdesk agents that reset passwords, provision access to tools, troubleshoot common issues, and create tickets when they cannot solve something. Identity handles the auth into Slack, Jira, Okta, and other internal systems.
FinOps and cost management agents that pull AWS Cost Explorer data, identify waste, and recommend optimizations. AWS itself published a reference implementation of this in early 2026.
Sales and marketing agents that draft outreach, qualify leads, and update CRMs. Epsilon reported reducing campaign setup time by 30% and saving 8 hours of manual work per team per week using AgentCore.
Data analysis agents that take a question in plain English, query the right database, run analysis with Code Interpreter, and return charts and summaries. Useful for business users who do not write SQL.
DevOps and incident response agents that monitor systems, investigate alerts, run diagnostic commands, and either resolve issues or page on-call humans with full context.
Coding assistants that work across your codebase, run tests, open PRs, and follow your team’s conventions.
How AgentCore Pricing Works
AgentCore uses consumption-based pricing with no upfront commitments and no minimum fees. You pay per second for what you actually use.
The most important thing to understand: AgentCore charges you only when your agent is actively consuming CPU. When the agent is waiting on an LLM response or a tool call (which is most of the time for typical agents), CPU charges drop to zero. Memory is charged based on peak consumption per second.
This is a meaningful difference from running agents on Lambda or ECS, where you pay for allocated resources during the entire request, including all the waiting time. AWS claims this pricing model delivers up to 3x lower CPU costs for typical agent workloads.
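The difference between the two billing models is easiest to see with arithmetic. The rates below are made up for illustration; only the shape of the two models reflects the pricing described above.

```python
def consumption_cost(active_cpu_s, peak_mem_gb, duration_s, cpu_rate, mem_rate):
    """AgentCore-style: bill CPU only while the agent is computing; memory at peak."""
    return active_cpu_s * cpu_rate + peak_mem_gb * duration_s * mem_rate

def allocated_cost(vcpus, mem_gb, duration_s, cpu_rate, mem_rate):
    """Lambda/ECS-style: pay for the allocation across the whole request, waits included."""
    return vcpus * duration_s * cpu_rate + mem_gb * duration_s * mem_rate

# Hypothetical rates and a typical turn: 60 s wall clock, only 12 s of actual CPU work.
CPU_RATE, MEM_RATE = 0.00005, 0.000005   # $/vCPU-s and $/GB-s, illustrative only
agentcore = consumption_cost(active_cpu_s=12, peak_mem_gb=2, duration_s=60,
                             cpu_rate=CPU_RATE, mem_rate=MEM_RATE)
always_on = allocated_cost(vcpus=1, mem_gb=2, duration_s=60,
                           cpu_rate=CPU_RATE, mem_rate=MEM_RATE)
print(f"consumption: ${agentcore:.4f}  allocated: ${always_on:.4f}")
```

With these assumed numbers the turn spends 80% of its time waiting on the model, and the allocated model costs about 3x more, which is where claims like "up to 3x" come from.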
Per-component pricing breakdown:
- Runtime: per-second vCPU and memory consumption, with a 1-second minimum
- Gateway: per MCP operation (ListTools, CallTool, Ping), per search query, and per indexed tool
- Memory: per memory operation and per record stored
- Identity: per authentication event
- Built-in Tools (Browser and Code Interpreter): per-second active resource consumption
- Observability: standard CloudWatch ingestion and storage charges
- Policy: per authorization request, plus token cost for natural-language policy authoring
- Evaluations: per evaluation token for built-in evaluators
- Agent Registry: free during preview
- Managed harness: free; you pay only for the underlying resources
The model inference itself (calls to Claude, Nova, Llama, etc.) is billed separately through standard Amazon Bedrock pricing. AgentCore charges are on top of those token costs, not a replacement.
For most teams running production agents, the bigger cost line is still model inference. AgentCore’s infrastructure costs are typically 10 to 30% of total agent costs at scale.
Getting Started: The Managed Harness
One of the most useful additions in April 2026 is the managed harness, which removes most of the work to get a working prototype.
With the harness, you define an agent by specifying three things:
- The model you want to use (Claude, Nova, GPT, Llama, etc.)
- The system prompt
- The tools the agent should have access to
That is it. No orchestration code. The harness manages the full agent loop: reasoning, tool selection, action execution, and response streaming. Each session gets its own microVM with filesystem access, so the agent can also work with files during a conversation.
The harness is model-agnostic. You can switch models mid-session if you want to test how different models handle the same task. Any setting you configure when creating the agent can be overridden per invocation, so you can experiment without redeploying.
When you are ready for full control, you can export the harness orchestration as Strands-based code and customize it from there.
Deploying with the AgentCore CLI
The AgentCore CLI launched alongside the harness in April 2026. It deploys agents as infrastructure-as-code with full audit history.
Today the CLI supports AWS CDK as the resource manager, with Terraform support coming soon. It is optimized for use with AI coding assistants (Claude Code, Codex, Cursor, Kiro), with pre-built skills that give the assistant accurate, current AgentCore guidance so the code it generates actually works.
The CLI is available in 14 AWS regions at no additional charge.
Security: How AgentCore Protects Your Data
Security in AgentCore is enforced at the infrastructure layer, not just at the application layer. This is important because agents can be tricked through prompt injection. Even if an attacker convinces the agent to do something it should not, AgentCore can stop the action before it happens.
Key security guarantees:
- Session isolation: each user session runs in its own microVM with isolated CPU, memory, and filesystem. Cross-session data contamination is impossible by design.
- Memory sanitization: when a session ends, the entire microVM is terminated and memory is wiped. There is no leftover state.
- Identity-bound access: agents can only access the resources their assigned IAM role permits, regardless of what a user tries to convince them to do.
- Policy enforcement: Cedar policies are evaluated at the Gateway layer for every tool call. If the policy says no, the agent cannot do it.
- Audit logging: every action the agent takes is logged through CloudTrail and Observability for full traceability.
For regulated industries (healthcare, financial services, government), AgentCore inherits Bedrock’s compliance posture: HIPAA eligible, GDPR compliant, SOC 2, ISO 27001, and FedRAMP authorized in GovCloud.
AgentCore vs Building Your Own Agent Stack
Should you use AgentCore or roll your own agent infrastructure? Here is the honest comparison.
Use AgentCore when:
- You want to ship agents to production fast and cannot spend weeks on infrastructure
- You need enterprise security (session isolation, audit logging, compliance certifications) without building it yourself
- Your team is small and cannot maintain custom infrastructure long-term
- You want the option to swap frameworks or models without rebuilding the platform
- Your traffic is variable, so paying only for active consumption beats paying for allocated capacity
Build your own when:
- Your traffic is extremely high and stable, so reserved GPU capacity beats consumption pricing
- You have strict data residency requirements that AgentCore regions do not cover
- You need to run on-premises or air-gapped (not currently supported)
- You need very specific custom routing, safety layers, or experimental models that AWS does not offer
- You have a large platform team that can maintain custom infrastructure as a competitive advantage
For 95% of teams building agents in 2026, AgentCore is the right starting point. You can always migrate off later if your needs outgrow it; moving from AgentCore to a custom platform is much easier than moving a hand-built stack onto AgentCore after the fact.
Region Availability and Roadmap
As of May 2026, AgentCore is available in these regions: US East (N. Virginia and Ohio), US West (Oregon), Europe (Ireland and Frankfurt), Asia Pacific (Mumbai, Singapore, Sydney, and Tokyo), and South America (São Paulo).
The managed harness is currently available in four regions and expanding. Evaluations is in nine regions. Most components are available across the full region list above.
AWS has been adding new features roughly every two to three weeks since launch. The most recent additions (Recommendations, A/B testing, Agent Registry) suggest AWS is treating AgentCore as a strategic platform, not a side project.
Frequently Asked Questions
Do I need to use Bedrock models with AgentCore?
No. AgentCore works with any LLM. You can use Claude through Bedrock, GPT through OpenAI, Gemini through Google, Llama self-hosted, or any other model. AgentCore is the platform layer, not the model layer.
Can I use my existing LangChain or CrewAI agent code on AgentCore?
Yes. AgentCore is framework-agnostic and supports LangGraph, LangChain, CrewAI, LlamaIndex, Strands Agents, and custom code. You containerize your agent and deploy it to Runtime.
What is the difference between AgentCore Memory and a vector database?
AgentCore Memory is a managed service that handles raw conversation storage, automatic fact extraction, semantic retrieval, and lifecycle management as one integrated system. A vector database is just one piece of that puzzle. With Memory, you do not write the extraction logic, you do not maintain the vector store, and you do not build the retrieval pipeline.
Does AgentCore work with non-AWS tools?
Yes. Gateway can wrap any HTTP API or OpenAPI spec, not just AWS services. Identity supports OAuth flows for any third-party service like Slack, GitHub, Salesforce, or Zoom.
How is AgentCore different from running agents on Lambda?
Lambda is a generic compute service. AgentCore Runtime is purpose-built for agent workloads, with longer session limits (8 hours vs Lambda’s 15 minutes), per-session microVM isolation, automatic versioning, and consumption-based pricing that does not charge for I/O wait time. For an agent that calls an LLM and waits 5 seconds for a response, AgentCore charges almost nothing for those 5 seconds. Lambda charges full rate.
Can AgentCore agents talk to each other?
Yes. AgentCore supports the Agent-to-Agent (A2A) protocol, so you can build multi-agent systems where a supervisor agent coordinates specialized worker agents. AWS published a reference implementation of this for FinOps in early 2026.
What does it cost to run a typical agent?
For a moderate-traffic customer support agent (say 10,000 conversations per month, 5 turns each), expect roughly $50 to $200 per month in AgentCore infrastructure costs, plus $200 to $800 in Bedrock model inference depending on the model used. Total: usually under $1,000 per month for a real production agent serving thousands of users.
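The arithmetic behind an estimate like this is simple to reproduce. Every rate below is an assumption chosen for illustration, not a published AWS price.

```python
# Back-of-envelope for the scenario above: 10,000 conversations/month, 5 turns each.
conversations, turns = 10_000, 5
infra_per_turn = 0.002        # assumed AgentCore runtime+memory+gateway cost per turn ($)
tokens_per_turn = 1_500       # assumed prompt + completion tokens per turn
price_per_1k_tokens = 0.006   # assumed blended model price ($)

infra = conversations * turns * infra_per_turn
inference = conversations * turns * tokens_per_turn / 1_000 * price_per_1k_tokens
print(f"infra ~${infra:.0f}/mo, inference ~${inference:.0f}/mo, total ~${infra + inference:.0f}/mo")
```

Under these assumptions the total lands around $550 per month, consistent with the ranges above, and the model bill dominates; your own numbers will move with token counts and model choice.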
How to Start with AgentCore Today
The fastest path to a working prototype:
- Sign in to the AWS Console and go to Amazon Bedrock
- Make sure you have model access enabled for Claude or another model you want to use
- Open AgentCore in the console and use the managed harness to define your first agent (model + system prompt + tools)
- Test it directly in the console
- When it works, use the AgentCore CLI to deploy it as infrastructure-as-code
- Add Memory, Gateway, Identity, and Observability one at a time as your needs grow
You can have a working agent in under an hour. Building the same thing from scratch on Lambda would take weeks.
Summary
Amazon Bedrock AgentCore is the production platform AWS has been missing for AI agents. It handles the parts that are tedious to build (infrastructure, memory, tool integration, identity, observability) so your team can focus on what makes your agent actually useful: the logic and the prompts.
It works with any framework, any model, and any AWS or third-party tool. It scales automatically. It charges only for what you use. It inherits enterprise security and compliance from AWS. And it is moving fast, with new features shipping every few weeks.
If you are building AI agents on AWS in 2026 and you are not at least evaluating AgentCore, you are probably going to spend a lot of time rebuilding things AWS has already built for you.


