Agents need names for the moments when the normal path should not continue. A missing source is different from a conflicting source. A tool timeout is different from a permission denial. A stale record is different from an irreversible action. A vague user request is different from a request that crosses an authority boundary. If the workflow treats all of those moments as generic failure, the agent will either stop too often, improvise too much, or hand the human a problem that has lost its shape.
An exception taxonomy is the shared vocabulary for those non-normal cases. It tells the agent, the reviewer, and the surrounding system what kind of problem has appeared and what kind of response is appropriate. The response may be retry, clarify, gather more evidence, escalate, narrow scope, request approval, checkpoint, or stop. The taxonomy does not solve every exception. It keeps exceptions from becoming a fog.
AI Agent Runbooks describe the normal operating rhythm. AI Agent Escalation Paths explain how uncertainty reaches the right person. The taxonomy sits between them. It helps the runbook say when the normal path no longer applies and helps escalation carry a clear reason rather than a vague complaint.
Exceptions Should Be Named By Cause And Consequence
The first design mistake is naming exceptions only by symptom. “Failed” is a symptom. “Blocked” is a symptom. “Cannot continue” is a symptom. Those labels may be true, but they do not tell the next actor what to do. A better label ties the cause to the consequence.
A source conflict means two or more sources disagree and the agent cannot determine which one governs. The consequence is not to pick the most convenient source. It is to preserve the conflict, apply known authority rules where they exist, and escalate if authority remains unclear. A missing required field means the task lacks information needed for a safe action. The consequence may be a clarifying question or a prepared draft with the missing field marked. A permission boundary means the next useful step requires authority the agent does not have. The consequence is an approval request or a stop, not a workaround.
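One way to make the cause-and-consequence pairing concrete is a small lookup table. The sketch below is illustrative only: the names Response, ExceptionCategory, TAXONOMY, and is_allowed are hypothetical, and the three entries simply restate the examples from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class Response(Enum):
    CLARIFY = "clarify"
    DRAFT_WITH_GAP_MARKED = "draft_with_gap_marked"
    ESCALATE = "escalate"
    REQUEST_APPROVAL = "request_approval"
    STOP = "stop"


@dataclass(frozen=True)
class ExceptionCategory:
    cause: str          # what went wrong
    responses: tuple    # Response values the agent may choose from
    not_allowed: str    # the tempting shortcut this label rules out


# Hypothetical taxonomy entries, one per example in the text.
TAXONOMY = {
    "source_conflict": ExceptionCategory(
        cause="two or more sources disagree and none clearly governs",
        responses=(Response.ESCALATE,),
        not_allowed="picking the most convenient source",
    ),
    "missing_required_field": ExceptionCategory(
        cause="the task lacks information needed for a safe action",
        responses=(Response.CLARIFY, Response.DRAFT_WITH_GAP_MARKED),
        # Assumption: the text does not name a forbidden shortcut here.
        not_allowed="acting as if the field were present",
    ),
    "permission_boundary": ExceptionCategory(
        cause="the next step requires authority the agent does not have",
        responses=(Response.REQUEST_APPROVAL, Response.STOP),
        not_allowed="a workaround",
    ),
}


def is_allowed(category: str, response: Response) -> bool:
    """Return True if the response fits the named exception category."""
    return response in TAXONOMY[category].responses
```

A verifier could call is_allowed("permission_boundary", Response.REQUEST_APPROVAL) to check that the agent's reaction matched the label it reported.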
This naming discipline supports AI Agent Output Verification. A verifier can inspect whether the agent responded to the exception correctly. Without a named exception, the final answer may only say that the agent did its best.
Clarification Is Not The Same As Escalation
Many agent workflows blur clarification and escalation. They both involve asking a person. They are not the same. Clarification asks the requester to supply missing task information, choose a preference, or resolve an ambiguity they are qualified to resolve. Escalation sends the issue to someone with authority, ownership, or risk responsibility beyond the requester.
If a user asks an agent to summarize a folder but does not name the date range, clarification may be right. If a user asks the agent to bypass an approval gate, escalation may be right. If a customer message conflicts with policy, the requester may not be the right person to decide. If a tool denies access to a restricted record, the agent should not ask the requester to paste the record into chat unless the data boundary allows it.
AI Agent Clarifying Questions focuses on asking before the run goes sideways. An exception taxonomy helps decide when a question is enough and when the issue needs a different lane. This distinction reduces both friction and risk. People are not interrupted for questions the agent could answer from context, and requesters are not asked to approve things they cannot safely approve.
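The split between the two lanes can be sketched as a small routing function. This is a minimal illustration under stated assumptions: the Issue fields and the route function are hypothetical names, and a real workflow would derive the two flags from policy rather than set them by hand.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    description: str
    # True when the requester is qualified to supply the missing detail
    # or make the choice (for example, naming a date range).
    requester_can_resolve: bool
    # True when the decision belongs to someone with authority, ownership,
    # or risk responsibility beyond the requester.
    needs_higher_authority: bool


def route(issue: Issue) -> str:
    """Pick the lane for an issue: 'clarify' or 'escalate'."""
    if issue.needs_higher_authority:
        # Authority questions never go back to the requester,
        # even if the requester would be willing to answer.
        return "escalate"
    if issue.requester_can_resolve:
        return "clarify"
    # Assumption: if nobody obvious can resolve it, prefer escalation
    # over guessing.
    return "escalate"
```

For example, route(Issue("no date range given", True, False)) returns "clarify", while route(Issue("asked to bypass approval gate", True, True)) returns "escalate".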
Retry Exceptions Need Evidence
Some exceptions deserve retry. A transient tool timeout, a temporary rate limit, a flaky test setup, or an unavailable page may clear on its own. Other exceptions become worse when retried. A duplicate send, a repeated charge, a second form submission, or a tool call with unclear side effects can create harm. The taxonomy should separate retryable from non-retryable conditions.
Retry labels should include what is known about side effects. A read timeout before any action is different from a timeout after submitting a state-changing request. In the second case, the next step may be confirmation of target state, not repetition. AI Agent Retries and Idempotency provides the deeper mechanics: idempotency keys, action IDs, dry runs, and stable status checks. The taxonomy gives the agent the language to choose those mechanics.
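The side-effect distinction can be sketched as a decision function. The parameter names and return labels below are assumptions made for illustration, and the idempotency-key branch presumes the target system actually honors such keys.

```python
def next_step(
    transient: bool,
    state_changing: bool,
    request_sent: bool,
    has_idempotency_key: bool,
) -> str:
    """Choose what to do after a failed tool call.

    transient           -- the failure looks temporary (timeout, rate limit)
    state_changing      -- the call could modify external state
    request_sent        -- the request may have reached the target
    has_idempotency_key -- the call carried a key the target deduplicates on
    """
    if not transient:
        return "do_not_retry"
    if not state_changing or not request_sent:
        # A read, or a write that never left the agent: repeating is safe.
        return "retry"
    if has_idempotency_key:
        # Assumes the target treats a repeated key as the same action.
        return "retry_with_same_key"
    # The write may have landed; check the target before doing anything else.
    return "confirm_target_state"
```

A read timeout maps to "retry", while a timeout after submitting an unkeyed state-changing request maps to "confirm_target_state" rather than repetition.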
Quota and rate-limit exceptions also need care. AI Agent Quota-Aware Execution explains graceful stops under limits. The taxonomy should distinguish “wait and retry later” from “budget exhausted, checkpoint now” and from “shared quota pressure, yield to higher-priority work.” Those conditions feel similar in the moment but imply different behavior.
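The three limit conditions can be told apart with very little state. In this sketch the function name, its inputs, and the ordering of the checks are assumptions; a real scheduler would have richer signals.

```python
def classify_limit(budget_remaining: int, higher_priority_waiting: bool) -> str:
    """Map a quota or rate-limit signal to one of three behaviors."""
    if budget_remaining <= 0:
        # Nothing left to spend: save progress instead of waiting.
        return "budget_exhausted_checkpoint_now"
    if higher_priority_waiting:
        # Budget exists, but shared quota is under pressure.
        return "shared_quota_pressure_yield"
    # A plain rate limit: back off and try again later.
    return "wait_and_retry_later"
```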
Stale State Is Its Own Exception
Agent work often spans time. A record can change after the agent reads it. A source can be updated after the draft is prepared. An approval can expire. A branch can move. A queue item can be claimed by another worker. If the workflow has no stale-state exception, the agent may proceed on assumptions that were true earlier and false now.
A stale-state exception says that the target no longer matches the inspected state. The right response may be revalidation, conflict handling, or a fresh approval. It should not be treated as an ordinary failure or silently ignored. This is especially important when agents use checkpoints. A resumed run must know what evidence needs rechecking before action.
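A minimal sketch of a stale-state check, assuming the target exposes some version token such as a revision number, an ETag, or a commit hash. Snapshot, StaleTargetState, and ensure_fresh are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    target_id: str
    version: str  # assumed version token: revision, ETag, commit hash, etc.


class StaleTargetState(Exception):
    """The target no longer matches the state the agent inspected."""


def ensure_fresh(inspected: Snapshot, current: Snapshot) -> None:
    """Raise StaleTargetState if the target changed since inspection."""
    if inspected.target_id != current.target_id:
        raise ValueError("snapshots describe different targets")
    if inspected.version != current.version:
        raise StaleTargetState(
            f"{inspected.target_id}: inspected {inspected.version}, "
            f"now {current.version}; revalidate before acting"
        )
```

A resumed run would call ensure_fresh with the snapshot stored in its checkpoint and a freshly read one, and treat the raised exception as a signal to revalidate, handle the conflict, or seek fresh approval.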
AI Agent Checkpoints and AI Agent Concurrency both depend on stale-state awareness. A checkpoint preserves work, but it also freezes assumptions. A lock prevents some collisions, but not every external change. The taxonomy gives resumed work a way to say, “the old path may no longer be valid.”
Policy Conflicts Need Reviewable Records
Some exceptions involve conflict between instructions, sources, or policies. A task request may conflict with durable policy. A retrieved document may conflict with a newer source. A page may contain hostile instructions. A customer may request an action outside the support policy. An old memory may conflict with current evidence.
These cases should not be collapsed into generic uncertainty. The taxonomy should name the kind of conflict and require the agent to preserve both sides in the handoff. Which source said what? Which source appears higher in the hierarchy? What did the agent treat as evidence rather than instruction? What decision remains for a reviewer?
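The four questions above suggest the shape of a handoff record. The following sketch is one possible layout, not a prescribed schema: Claim, ConflictRecord, and the authority_rank convention (lower number means higher authority) are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    source: str                    # which source said it
    statement: str                 # what it said
    authority_rank: Optional[int]  # lower = higher authority; None if unknown
    treated_as: str                # "instruction" or "evidence"


@dataclass
class ConflictRecord:
    kind: str             # e.g. "request_vs_policy", "old_vs_new_source"
    claims: List[Claim]   # every side is preserved, not just the winner
    open_decision: str    # what remains for the reviewer

    def governing(self) -> Optional[Claim]:
        """Return the single highest-authority claim, or None if unclear."""
        if not self.claims or any(c.authority_rank is None for c in self.claims):
            return None
        best = min(c.authority_rank for c in self.claims)
        top = [c for c in self.claims if c.authority_rank == best]
        return top[0] if len(top) == 1 else None
```

When governing() returns None, authority is unclear and the record goes to a reviewer with both sides intact.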
AI Agent Instruction Hierarchies gives the rules for keeping goals, policies, evidence, and untrusted content in order. The exception taxonomy gives the workflow a reporting surface when those layers fight. A reviewer should not have to infer that a conflict existed from a cautious-sounding paragraph.
Unknown Exceptions Should Not Become A Junk Drawer
No taxonomy can cover every case. There should be a label for unclassified exceptions, but it should be treated as temporary. If “unknown” becomes common, the taxonomy is not doing its job. Repeated unknowns should be reviewed and either given a name, routed into an existing category, or used to improve the runbook.
This connects to AI Agent Operating Metrics. Exception categories produce signals. A rise in missing-source exceptions may point to broken retrieval. A rise in permission-boundary exceptions may point to poor task routing. A rise in stale-state exceptions may point to concurrency problems. A rise in unclassified exceptions may point to a workflow moving beyond its original design.
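As a rough sketch of how category labels turn into signals, the snippet below computes each label's share of recorded exceptions and flags when the unclassified share is high. The label "unclassified" and the 10 percent threshold are illustrative assumptions, not recommended values.

```python
from collections import Counter
from typing import Dict, List


def exception_rates(labels: List[str]) -> Dict[str, float]:
    """Return each category's share of all recorded exceptions."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


def unclassified_needs_review(labels: List[str], threshold: float = 0.10) -> bool:
    """Flag the taxonomy for review when 'unclassified' is too common.

    The threshold is an arbitrary placeholder; pick one that fits the workflow.
    """
    return exception_rates(labels).get("unclassified", 0.0) > threshold
```

Tracking the same rates per week would show whether a given category is rising, which is the signal the paragraph above describes.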
The taxonomy should remain small enough to use. It is better to have a dozen well-understood categories than a sprawling menu no one applies consistently. The names should appear in runbooks, handoffs, traces, and review queues. They should be ordinary language, not private jargon that only the system designer understands.
The Point Is Predictable Stopping
An exception taxonomy is not an attempt to make agents timid. It is a way to make stopping, retrying, and escalating predictable. When the normal path is safe, the agent can move. When a known exception appears, it can respond according to the category. When an unknown exception appears, it can preserve the evidence and stop without pretending to have solved the case.
That predictability helps humans too. A reviewer who sees “source conflict” knows to inspect authority. A requester who sees “missing required field” knows what to provide. An operator who sees “stale target state” knows the run needs revalidation. An incident reviewer who sees “permission boundary crossed” knows the problem is not just a bad answer.
Good delegated work depends on this kind of naming. Agents do not need a dramatic story for every exception. They need a stable vocabulary that keeps the problem legible. Once the case has a name, the next action becomes easier to choose and easier to review.



