AWS Agentic AI Security Scoping Matrix: Scopes to Controls

The Agentic AI Security Scoping Matrix is a classification framework AWS published on its Security Blog on November 21, 2025, sorting agentic AI systems into four scopes of agency — Scope 1 (No Agency) through Scope 4 (Full Agency) — with six security dimensions that apply differently at each level. Written by Aaron Brown and Matt Saner, it answers the question that lands on every security architect's desk the week an agent pilot goes live: what scope are our agents, and what does that scope demand from us?

AWS answered the first half well. The scope definitions are clean, and the framework was picked up fast — the Cloud Security Alliance published a formal enhancement within a month. What almost nobody has written is the second half: the operational mapping. For each scope, which controls do you actually stand up? Which gateway policy, which hardening step, which injection defense? That mapping is this page.

One observation before the definitions. We have run scope-classification exercises with three enterprise teams since the matrix shipped, and every one under-classified at least one agent. The deployment described as "just a chatbot" held a write-capable ticketing tool nobody remembered granting. Keep that in mind as you read.

Why AWS Built a Second Scoping Matrix

AWS has been here before. In October 2023, Matt Saner and Mike Lapidakis published the Generative AI Security Scoping Matrix: five scopes based on ownership, from Scope 1 (consuming a public third-party AI app) up through Scope 5 (training your own model from scratch). The organizing question was who built and operates the model, because in 2023 that was the variable that moved your security responsibilities.

Agents broke that axis. Two deployments of the same foundation model, on the same account, in the same VPC, can carry wildly different risk — one only answers questions, the other can open pull requests, move money, or email your customers. The variable that matters is no longer ownership. It is what the system may do without a human in the loop. The 2025 matrix reorganizes the threat model around that question, and it complements rather than replaces the 2023 version: the old matrix places responsibility for the model, the new one scopes the agency wrapped around it.

The autonomy axis also changes what a control even is. In a chat deployment, a bad output is a bad paragraph. In an agentic system, a bad output is an action — and actions need perimeters, approvals, and kill switches, not just content filters.

The Four Scopes, Defined

Definitions first, commentary second — these are the sentences you will end up pasting into your own governance docs, so they need to be exact.

Scope 1: No Agency

A Scope 1 system is human-initiated and has no ability to change anything — in AWS's words, "The agents are, essentially, read-only." The agent can search, retrieve, summarize, and recommend, but nothing it does modifies the environment. AWS's running example is an assistant that searches calendars and suggests meeting times but cannot book anything. Most enterprise search and RAG assistants live here — as long as every connected tool really is read-only, which is the thing to verify rather than assume.

Scope 2: Prescribed Agency

A Scope 2 system can prepare changes to the environment, but a human must approve each one before it executes — mandatory human-in-the-loop (HITL), no exceptions. The agent analyzes, proposes, and stages the action; an authorized person pushes the button. Think of a remediation agent that drafts an infrastructure change and opens it for review. The approval is the security boundary, so it has to live outside the agent — an agent that approves its own actions is not Scope 2, whatever the architecture diagram says.
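To make "the approval has to live outside the agent" concrete, here is a minimal sketch of an approval gate that binds each approval to the exact action that was reviewed. Every name in it (`action_digest`, `Approval`, `execute_if_approved`) is hypothetical and chosen for illustration; none of it comes from the AWS post. The assumption is that the gate runs in a separate process the agent cannot modify.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def action_digest(action: dict) -> str:
    """Stable hash of a staged action; any change to the action changes the digest."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    approver: str  # human identity, supplied by the approval system, not by the agent
    digest: str    # digest of the exact action the human reviewed


def execute_if_approved(action: dict, approval: Optional[Approval], agent_id: str) -> str:
    """Hypothetical gate: returns a status string instead of really executing."""
    if approval is None:
        return "blocked: no approval"
    if approval.approver == agent_id:
        return "blocked: agent cannot approve its own action"
    if approval.digest != action_digest(action):
        return "blocked: action changed after approval"
    return "executed"


staged = {"tool": "update_security_group", "args": {"port": 443}}
approval = Approval(approver="alice", digest=action_digest(staged))
print(execute_if_approved(staged, approval, agent_id="remediation-agent"))
```

Hashing the staged action is one way to get the approval-to-action binding: if the agent edits the change after review, the digest no longer matches and the gate refuses.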

Scope 3: Supervised Agency

A Scope 3 system is initiated by a human but then executes autonomously — it makes contextual decisions and takes environment-modifying actions without coming back for per-action approval. The human sets the task and the boundaries; the agent runs to completion inside them. This is where most serious enterprise agent deployments in 2026 actually sit: a Claude Cowork-style deployment, where an employee assigns work and the agent executes multi-step tasks under supervision, classifies as Scope 3. AWS flags "preventing scope creep during task execution" as a core concern at this level — the boundaries you set at initiation are the whole game.

Scope 4: Full Agency

A Scope 4 system is self-initiating: it operates continuously with minimal human oversight and can start its own work based on environmental monitoring, learned patterns, or predefined conditions. Nobody clicks go. Genuinely Scope 4 systems are still rare in production — a fair number of deployments described to us as "fully autonomous" turned out to be Scope 3 jobs on a scheduler. The distinction matters because Scope 4 is where AWS's control language gets blunt: automated kill switches for runaway processes stop being advice and become table stakes.
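One way to picture an automated kill switch for a self-initiating agent is a budget on how often it may start its own work, with a latch that stays tripped until a human resets it. The sketch below is our illustration, not AWS's design: `InitiationGovernor`, its parameters, and the choice to trip on budget exhaustion are all assumptions.

```python
import time


class InitiationGovernor:
    """Sliding-window cap on self-initiated runs, plus a latching kill switch."""

    def __init__(self, max_runs: int, window_seconds: float, clock=time.monotonic):
        self.max_runs = max_runs
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self.starts = []            # timestamps of recent self-initiated runs
        self.killed = False

    def may_start(self) -> bool:
        """Called by the scheduler or monitor before the agent starts a run."""
        if self.killed:
            return False
        now = self.clock()
        self.starts = [t for t in self.starts if now - t < self.window_seconds]
        if len(self.starts) >= self.max_runs:
            self.killed = True      # assumption: exceeding the budget trips the switch
            return False
        self.starts.append(now)
        return True

    def reset(self, operator: str) -> None:
        """Only a human operator clears the latch; the agent has no path to this."""
        if not operator:
            raise ValueError("reset requires a named operator")
        self.killed = False
        self.starts = []
```

The latch is the point. A rate limit that quietly recovers after the window expires lets a runaway loop resume on its own; a tripped switch forces someone to look.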

The Six Security Dimensions, Translated for Operators

The matrix crosses the four scopes with six dimensions of control. The names are accurate but abstract — here is what each demands in practice.

Identity Context

Who is the agent acting as, and can you prove it on every call? That means authenticated agent identities, user-identity propagation through every tool invocation, and no shared service accounts standing in for "the agent." In MCP-based stacks this is the problem that gateway-level auth and discovery solve: OAuth on the agent-to-server hop, and session identity that survives multi-step workflows.
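As a rough sketch of what "prove it on every call" can look like, the check below rejects any tool call whose context lacks an agent identity, a propagated user identity, or a session identity. The `CallContext` shape and the `SHARED_ACCOUNTS` list are hypothetical; a real gateway would derive these values from verified tokens rather than plain strings.

```python
from dataclasses import dataclass

# Hypothetical list of shared service accounts that must not stand in for an agent.
SHARED_ACCOUNTS = frozenset({"svc-agent", "svc-shared"})


@dataclass(frozen=True)
class CallContext:
    agent_id: str       # the authenticated agent identity
    on_behalf_of: str   # the human user whose request started the session
    session_id: str     # identity that must survive multi-step workflows


def validate_context(ctx: CallContext) -> list:
    """Return a list of identity problems; an empty list means the call may proceed."""
    problems = []
    if not ctx.agent_id:
        problems.append("missing agent identity")
    elif ctx.agent_id in SHARED_ACCOUNTS:
        problems.append("shared service account used as agent identity")
    if not ctx.on_behalf_of:
        problems.append("user identity not propagated")
    if not ctx.session_id:
        problems.append("missing session identity")
    return problems
```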

Data, Memory, and State Protection

Agent memory is data. Session state, scratchpads, vector stores, and conversation history inherit the classification of whatever flowed into them — and they outlive the request that created them. If your agent read a restricted document at 9am, its memory is restricted at 5pm.
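The inheritance rule can be sketched as a memory store whose label only ever ratchets upward. The four classification levels and the `AgentMemory` class are assumptions for illustration; substitute your own data classification scheme.

```python
from enum import IntEnum


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


class AgentMemory:
    """Memory whose label is the highest classification ever written into it."""

    def __init__(self) -> None:
        self.entries = []
        self.label = Classification.PUBLIC

    def write(self, text: str, source_label: Classification) -> None:
        self.entries.append(text)
        # The label never drops, even if later writes are public.
        self.label = max(self.label, source_label)

    def readable_by(self, clearance: Classification) -> bool:
        return clearance >= self.label


memory = AgentMemory()
memory.write("cafeteria menu", Classification.PUBLIC)
memory.write("restricted document summary", Classification.RESTRICTED)  # the 9am read
memory.write("weather forecast", Classification.PUBLIC)
print(memory.label.name)  # RESTRICTED
```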

Audit and Logging

Not API logs — decision trails. At Scope 2 and above you need to reconstruct why the agent did what it did: which instruction, which retrieved context, which tool output led to which action. That is agent observability. If you cannot query it per-session, you do not have it.
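A decision trail differs from an API log mainly in what each record holds. The sketch below assumes a hypothetical record shape (instruction, context read, action chosen) and an in-memory store; a production system would persist this somewhere queryable, but the per-session query is the part that matters.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DecisionRecord:
    session_id: str
    step: int
    instruction: str      # what the agent was working on at this step
    context_ids: tuple    # IDs of retrieved documents or tool outputs it had read
    action: str           # the tool call it chose as a result


@dataclass
class DecisionTrail:
    records: list = field(default_factory=list)

    def log(self, record: DecisionRecord) -> None:
        self.records.append(record)

    def for_session(self, session_id: str) -> list:
        """Return one session's records in execution order."""
        hits = [r for r in self.records if r.session_id == session_id]
        return sorted(hits, key=lambda r: r.step)

    def why(self, session_id: str, action: str) -> list:
        """Which context was in play each time this action was chosen?"""
        return [r.context_ids for r in self.for_session(session_id) if r.action == action]
```

If `why()` cannot be answered for a given session, the reconstruction requirement is not met, however many API logs exist.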

Agent and FM Controls

Guardrails and containment for the model itself: input and output filtering, process isolation, and — the dimension attackers probe first — prompt injection defense. Every tool result an agent reads is a potential instruction channel, and the more agency the scope grants, the more an injected instruction can do.

Agency Perimeters and Policies

The explicit boundaries of autonomous action: fixed execution paths at Scope 1, predefined action limits at Scope 2, boundaries set at initiation for Scope 3, kill switches at Scope 4. Perimeters must be enforced outside the model — a system prompt that says "do not delete records" is a wish, not a control.
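Here is a minimal sketch of a perimeter enforced outside the model, assuming a hypothetical `Gateway` that sits between the agent and its tools. The tool names and the write budget are illustrative. What matters is that the check is deterministic code the agent cannot talk its way around.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Perimeter:
    allowed_tools: frozenset   # set when the task is initiated
    max_write_actions: int     # cap on environment-modifying calls for this task


class PerimeterViolation(Exception):
    """Raised when a tool call falls outside the perimeter."""


class Gateway:
    """Hypothetical enforcement point between the agent and its tools."""

    def __init__(self, perimeter: Perimeter, write_tools: frozenset) -> None:
        self.perimeter = perimeter
        self.write_tools = write_tools
        self.writes_used = 0

    def authorize(self, tool: str) -> None:
        """Raise PerimeterViolation unless the call is inside the perimeter."""
        if tool not in self.perimeter.allowed_tools:
            raise PerimeterViolation(f"tool not in perimeter: {tool}")
        if tool in self.write_tools:
            if self.writes_used >= self.perimeter.max_write_actions:
                raise PerimeterViolation("write budget exhausted")
            self.writes_used += 1
```

Whatever the system prompt says, a `delete_record` call that is not in `allowed_tools` never reaches the tool.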

Orchestration

How agents coordinate across tools, services, and other agents without security context evaporating at each hop. This is the layer where an MCP gateway earns its keep: one policy enforcement point for every tool connection instead of N bespoke integrations.

The Mapping: What You Deploy at Each Scope

Here is the part the summaries skip. Controls are cumulative: each scope inherits everything below it and adds a layer. The table shows only what each scope adds.

Scope	What changes	Controls added at this scope
1 — No Agency	Agent reads and recommends	Gateway authN/authZ on every tool call; read-only tool allowlists; MCP server hardening; retrieval logging
2 — Prescribed Agency	Agent stages changes, human approves	Approval workflow outside the agent; staged-change isolation; audit trail binding each approval to each action
3 — Supervised Agency	Agent executes without per-action approval	Agency perimeters enforced at the gateway; session-scoped credentials; hardened injection defense; trajectory evaluation before autonomy widens
4 — Full Agency	Agent initiates its own work	Kill switches; anomaly detection on agent behavior; initiation budgets and rate limits; continuous evaluation gates
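Because the rows are cumulative, the full requirement for a scope is the union of its row and every row beneath it. A short sketch, using the control names from the table verbatim; the function names are ours:

```python
# Controls added at each scope, copied from the table above.
CONTROLS_ADDED = {
    1: ["gateway authN/authZ on every tool call", "read-only tool allowlists",
        "MCP server hardening", "retrieval logging"],
    2: ["approval workflow outside the agent", "staged-change isolation",
        "audit trail binding each approval to each action"],
    3: ["agency perimeters enforced at the gateway", "session-scoped credentials",
        "hardened injection defense", "trajectory evaluation before autonomy widens"],
    4: ["kill switches", "anomaly detection on agent behavior",
        "initiation budgets and rate limits", "continuous evaluation gates"],
}


def required_controls(scope: int) -> list:
    """Every control a scope needs: its own row plus all rows below it."""
    if scope not in CONTROLS_ADDED:
        raise ValueError(f"unknown scope: {scope}")
    return [c for level in range(1, scope + 1) for c in CONTROLS_ADDED[level]]


def remediation_backlog(scope: int, deployed: set) -> list:
    """Required controls that are not yet deployed, lowest scope first."""
    return [c for c in required_controls(scope) if c not in deployed]
```

The backlog comes out ordered from Scope 1 upward, which matches the order in which the gaps usually need closing.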

Figure: Cumulative control stack across the four scopes. Scope 1 adds gateway authentication, read-only allowlists, server hardening, and logging; Scope 2 adds an external approval workflow and approval-to-action audit binding; Scope 3 adds enforced agency perimeters, session-scoped credentials, injection defense, and trajectory evaluation; Scope 4 adds kill switches, behavioral anomaly detection, and initiation budgets. An arrow shows that each scope inherits every control below it.

A few of these deserve a sentence of defense.

Gateway auth at Scope 1 surprises people — it is a read-only assistant, why the ceremony? Because Scope 1 is where identity mistakes get institutionalized. If your read-only agent runs on a shared service account today, that account will be quietly reused when someone grants it a write tool next quarter. Getting MCP gateway authentication and discovery right at Scope 1 costs a week; retrofitting it at Scope 3 costs a quarter.

Server hardening is scope-independent but stakes-dependent. A poisoned MCP server behind a Scope 1 agent leaks data; behind a Scope 3 agent it executes actions. The hardening checklist does not change between scopes, but the blast radius of skipping it grows with every level.

And the Scope 3 row leans hard on evaluation for a reason. AWS's companion post from April 2026, "Four security principles for agentic AI systems," states it outright: enforce security through deterministic, infrastructure-level controls external to the agent's reasoning loop, and expand autonomy based on demonstrated performance. An agent earns Scope 3 by passing agent evaluations at Scope 2 — autonomy is a promotion, not a launch setting.

How to Classify Your Own Agents

The classification itself takes about 30 minutes per agent if you are honest, considerably longer if you are optimistic. The procedure we use:

1. Inventory every tool the agent can reach, including tools exposed through connected MCP servers. Pull the live tool list from the gateway, not from the design doc.
2. Mark each tool read or write. A tool that sends email, creates tickets, or modifies records is a write tool even if the team calls the agent an assistant.
3. Identify the initiator. Does a human start every run, or can the agent trigger itself from a schedule, webhook, or monitor? Self-initiation puts Scope 4 on the table immediately.
4. Locate the approval gate. If a human approves each change before it executes, outside the agent, you are Scope 2. If approval happens once at kickoff, you are Scope 3.
5. Assign the scope and pull its control row from the table above. The gaps between the controls you have and the row you landed in are your remediation backlog, pre-prioritized.
6. Re-classify on every tool grant. A new write tool is a scope-change event and should trigger the same review a scope promotion would.
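The decision logic in the procedure above can be sketched as a single function. This is our reading of the procedure, with two deliberately conservative assumptions: any self-initiating agent is scored Scope 4 (the procedure only says Scope 4 is "on the table"), and any tool not explicitly marked "read" counts as a write tool. `AgentProfile` is a hypothetical shape for the inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    tools: dict                 # tool name -> "read" or "write", from the live gateway list
    self_initiating: bool       # a schedule, webhook, or monitor can start a run
    per_action_approval: bool   # a human outside the agent approves each change


def classify(agent: AgentProfile) -> int:
    """Return the scope (1-4) implied by the inventory."""
    if agent.self_initiating:
        return 4   # conservative assumption: review as Scope 4 until shown otherwise
    has_write = any(mode != "read" for mode in agent.tools.values())
    if not has_write:
        return 1
    return 2 if agent.per_action_approval else 3


chatbot = AgentProfile(
    tools={"search_docs": "read", "create_ticket": "write"},
    self_initiating=False,
    per_action_approval=False,
)
print(classify(chatbot))  # 3
```

The "just a chatbot" with a forgotten ticketing tool comes out as Scope 3 here, which is the under-classification pattern this procedure is meant to catch.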

Nobody proposes "let's promote the chatbot to Supervised Agency." Instead, a Scope 1 assistant gains a ticket-creation tool in a sprint, a status-update tool the next month, and eighteen weeks later it is a Scope 3 system wearing Scope 1 controls. Scope creep by tool accretion is the most common finding across the classification exercises we have run. The structural fix is to route tool grants through the gateway, where adding a write tool is a reviewable policy change instead of a config edit.
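Tool accretion is easy to detect if you keep the tool list from the last classification and diff it against the live one. A minimal sketch, assuming both lists are available as name-to-mode dictionaries; the `tool_drift` function and its report keys are hypothetical:

```python
def tool_drift(classified: dict, live: dict) -> dict:
    """Compare the tool list at last classification with the live gateway list."""
    added = {name: mode for name, mode in live.items() if name not in classified}
    changed = {
        name: mode
        for name, mode in live.items()
        if name in classified and classified[name] != mode
    }
    # Anything new or changed that is not explicitly read-only is treated as a write.
    new_writes = sorted(
        name for name, mode in {**added, **changed}.items() if mode != "read"
    )
    return {
        "added": sorted(added),
        "changed": sorted(changed),
        "new_write_tools": new_writes,
        "scope_review_needed": bool(new_writes),
    }
```

Run as a gate on the grant itself, this turns "someone added a tool in a sprint" into an event that blocks until the scope review happens.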

What the AWS Scoping Matrix Does Not Cover

The framework is genuinely useful, and it has real gaps. Three matter in practice.

Read-only is not risk-free. The Cloud Security Alliance's December 16, 2025 enhancement, written by Ken Huang, pushes on the "No Agency" label directly: even read-only agents autonomously decide which data sources to access, which is a form of agency — and a data-exfiltration surface. Huang proposes splitting the model into two axes, data operations versus autonomy level. Adopt his 3x3 grid or not, the correction stands: treat Scope 1 retrieval scopes and logging as real controls, not formalities.

Single-agent framing. The matrix classifies one system at a time and has little to say about composite scope: a Scope 2 orchestrator that delegates to a Scope 3 sub-agent is, in effect, a Scope 3 system with a misleading label on top. Until the framework grows a composition rule, score a multi-agent deployment at the highest scope any member holds — and treat cross-organization agent-to-agent connections as out of framework entirely.
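The interim composition rule suggested here, scoring at the highest scope any member holds, is a one-liner. The function name and input shape are ours:

```python
def composite_scope(member_scopes: dict) -> int:
    """Score a multi-agent deployment at the highest scope any member holds."""
    if not member_scopes:
        raise ValueError("deployment has no members")
    for name, scope in member_scopes.items():
        if scope not in (1, 2, 3, 4):
            raise ValueError(f"invalid scope for {name}: {scope}")
    return max(member_scopes.values())


print(composite_scope({"orchestrator": 2, "sub-agent": 3}))  # 3
```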

Drift over time. A scope classification is a snapshot. Six months on, the same agent holds tools nobody classified, runs on a system prompt someone has since edited, and may be sitting on a different model entirely. The matrix gives you the taxonomy but not the re-certification cadence — that part you build yourself, which is what step 6 above and the governance gates below are for.

Lining the Scoping Matrix Up with NIST AI RMF and ISO 42001

The matrix is not a compliance framework, but it slots into the two frameworks your auditors actually ask about, and in specific places.

NIST AI RMF. The framework's core is four functions — GOVERN, MAP, MEASURE, MANAGE. Scope classification is a MAP-function artifact: MAP is where you establish context and categorize the system for a go/no-go decision, and "this agent is Scope 3" is precisely that categorization. Evaluation gates that earn an agent more autonomy line up with MEASURE; the control stack per scope is MANAGE; the policy on who may approve a scope promotion is GOVERN.

ISO/IEC 42001. The AI management system standard, published December 2023, requires AI system impact assessments and offers 38 Annex A controls across nine objectives, selected via a Statement of Applicability. Scope level is a natural input to both: the impact assessment for a Scope 3 agent must consider autonomous-action harm a Scope 1 assessment can rule out, and the SoA justification for skipping a control gets harder as scope rises. One caution: AWS holds accredited ISO/IEC 42001 certification covering services like Amazon Bedrock — that covers the platform, not your agent. A Scope 3 deployment on a certified platform still needs its own impact assessment.

The practical pattern that falls out of this: use scope as the trigger for governance depth. Scope 1 gets a lightweight review, Scope 2 adds approval-workflow verification, Scope 3 requires a full impact assessment plus evaluation evidence, and Scope 4 goes to the top of the risk committee. The matrix gives each gate a defensible, vendor-published definition — worth a lot the first time someone disputes a classification in a steering meeting.

Frequently Asked Questions

What is the AWS Agentic AI Security Scoping Matrix?

It is a classification framework AWS published on its Security Blog on November 21, 2025 that sorts agentic AI systems into four scopes based on how much autonomous, environment-changing capability they hold, and defines six security dimensions — identity context, data/memory/state protection, audit and logging, agent and FM controls, agency perimeters, and orchestration — whose demands escalate with scope.

What are the four scopes of agency?

Scope 1 (No Agency) is human-initiated and read-only. Scope 2 (Prescribed Agency) can stage changes but requires mandatory human approval before each execution. Scope 3 (Supervised Agency) is human-initiated but executes autonomously without per-action approval. Scope 4 (Full Agency) is self-initiating, operating continuously with minimal human oversight.

How is it different from the 2023 Generative AI Security Scoping Matrix?

The 2023 matrix classified five scopes by ownership — from consuming a third-party AI app to training your own model. The 2025 agentic matrix classifies by autonomy instead, because two agents on identical models carry different risk depending on what they can do without a human. The two are complementary: one places the model, the other scopes the agency around it.

What scope is a supervised agent platform like Claude Cowork?

A Cowork-style deployment — a human assigns a task, the agent executes multi-step work autonomously under supervision, without per-action approval — is Scope 3, Supervised Agency. The controls that matter most there are agency perimeters, session-scoped credentials, injection defense, and trajectory evaluation, as covered in our Claude Cowork enterprise guide.

Which controls should a Scope 3 agent have?

Everything Scopes 1 and 2 need — gateway authentication on every tool call, hardened MCP servers, audit logging — plus enforced agency perimeters, session-scoped credentials that expire with the task, hardened prompt-injection defense, and evaluation evidence justifying the autonomy. AWS's own guidance is that autonomy should be earned through ongoing evaluation, not granted by default.

Does the matrix map to NIST AI RMF or ISO 42001?

Yes, though informally — the matrix is a categorization scheme, not a compliance standard. Scope classification functions as a MAP-function artifact under NIST AI RMF and as an input to ISO/IEC 42001 impact assessments, where scope level sets assessment depth and shapes which Annex A controls apply. AWS has not published an official crosswalk, so document your own mapping.

References

AWS Security Blog — "The Agentic AI Security Scoping Matrix: A framework for securing autonomous AI systems" (Aaron Brown and Matt Saner, November 21, 2025). https://aws.amazon.com/blogs/security/the-agentic-ai-security-scoping-matrix-a-framework-for-securing-autonomous-ai-systems/
AWS — "Securing Agentic AI: The Agentic AI Security Scoping Matrix" (framework reference page with scope examples). https://aws.amazon.com/ai/security/agentic-ai-scoping-matrix/
AWS Security Blog — "Securing generative AI: An introduction to the Generative AI Security Scoping Matrix" (Matt Saner and Mike Lapidakis, October 19, 2023). https://aws.amazon.com/blogs/security/securing-generative-ai-an-introduction-to-the-generative-ai-security-scoping-matrix/
AWS Security Blog — "Four security principles for agentic AI systems" (Mark Ryland, Riggs Goodman III, and Todd MacDermid, April 2, 2026). https://aws.amazon.com/blogs/security/four-security-principles-for-agentic-ai-systems/
Cloud Security Alliance — "Enhancing the Agentic AI Security Scoping Matrix: A Multi-Dimensional Approach" (Ken Huang, December 16, 2025). https://cloudsecurityalliance.org/blog/2025/12/16/enhancing-the-agentic-ai-security-scoping-matrix-a-multi-dimensional-approach
NIST — "Artificial Intelligence Risk Management Framework (AI RMF 1.0)," NIST AI 100-1. https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf
ISO — "ISO/IEC 42001:2023 — Information technology — Artificial intelligence — Management system." https://www.iso.org/standard/81230.html
AWS — "ISO/IEC 42001 FAQs" (accredited certification scope, including Amazon Bedrock). https://aws.amazon.com/compliance/iso-42001-faqs/