AGENT RUNBOOK SPEC TEMPLATE (ONE PAGE) 1) Workflow name - Short name: - Owner persona (job title): - Business system(s) involved: 2) Operational outcome (be concrete) - “Done” definition (what changed in the world?): - Success criteria (qualitative, not vanity metrics): - Hard constraints (must never happen): 3) Data boundaries - Allowed data sources (systems, tables, drives, repos): - Disallowed data sources: - Retention notes (what you store, for how long): 4) Action allowlist (server-enforced) List each permitted tool/action as a row: - Action name: - Read or write: - Required parameters (typed): - Preconditions (must be true before action can run): - Postconditions (what you verify after): 5) Permission model - Auth method (OAuth, service account, API key, etc.): - Least-privilege scopes/roles: - Tenant isolation approach: - How credentials are rotated / expired: 6) Approval points (make approvals cheap) For each approval gate: - Trigger (which action or risk level): - Approver role: - Evidence shown (diff, citations, impacted record IDs, preview): - Approval channel (in-app queue, Slack/Teams card, email): 7) Receipts (auditability) Define the minimum receipt for every run: - Run ID + correlation IDs: - Tool call log (action + args + timestamp + actor): - Retrieval lineage (which docs/snippets were used): - Prompt/version identifiers: - Final outputs and writes: - Human approvals/rejections: 8) Failure modes + safe behavior List likely failures and what the system does: - Rate limit / timeout: - Partial write: - Conflicting human edits: - Missing permission: - Hallucinated identifiers / nonexistent objects: - Degraded mode (read-only? ask human? stop?): 9) Rollback / staging plan - Which writes are staged (draft objects, previews): - Which writes are reversible (soft delete, compensating txn): - Who can trigger rollback: - How rollback is logged: 10) Evaluation gate (before release) - Golden test cases (small set you’ll maintain): - Regression trigger (what changes require re-test): - Acceptance criteria (what must be true to ship): - Manual review requirement (what must be sampled by a human): Use this template for one workflow. If you can’t fill it out cleanly, shrink the scope until you can.