Production Agent Authority Checklist (v1)

Use this checklist before you claim your product is an “agent” (not a chat demo). The goal: an agent that can take real actions while staying safe, inspectable, and revocable.

1) Define the agent’s identity
- Create a dedicated agent identity per customer tenant/workspace (not a shared API key).
- Document where credentials live (KMS/secret manager) and how rotation works.
- Provide a one-click revoke path for admins.
- Decide whether you support SSO/SAML/SCIM for human approvers now or later; don’t pretend it won’t matter.
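
The identity items above can be sketched as a small registry. This is a minimal illustration, not a prescribed design: `AgentIdentity`, `IdentityRegistry`, and the `secret_ref` field are hypothetical names, and the in-memory dictionary stands in for whatever store you actually use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class AgentIdentity:
    tenant_id: str
    secret_ref: str  # pointer into a secret manager, never the secret itself
    revoked_at: Optional[datetime] = None


class IdentityRegistry:
    """In-memory stand-in (assumption) for wherever agent identities are stored."""

    def __init__(self) -> None:
        self._by_tenant: Dict[str, AgentIdentity] = {}

    def create(self, tenant_id: str, secret_ref: str) -> AgentIdentity:
        # One identity per tenant; refusing duplicates prevents silent sharing.
        if tenant_id in self._by_tenant:
            raise ValueError(f"tenant {tenant_id} already has an agent identity")
        identity = AgentIdentity(tenant_id, secret_ref)
        self._by_tenant[tenant_id] = identity
        return identity

    def revoke(self, tenant_id: str) -> None:
        # The revoke path is a single call that takes effect immediately.
        self._by_tenant[tenant_id].revoked_at = datetime.now(timezone.utc)

    def is_active(self, tenant_id: str) -> bool:
        # Unknown tenants and revoked identities are both treated as inactive.
        identity = self._by_tenant.get(tenant_id)
        return identity is not None and identity.revoked_at is None
```

Every tool call would check `is_active` before running, so revoking one tenant's agent never touches another tenant.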

2) Scope permissions to least privilege
- List every action the agent can take as explicit verbs (e.g., create_ticket, refund_payment, merge_pr).
- Map each verb to an allowlist of tools/APIs and required scopes.
- Split dev/staging/prod access. Default new installs to lowest-risk environments.
- Add rate limits and resource limits (per project, per repo, per workspace).
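
One way to express the verb-to-allowlist mapping is a deny-by-default policy table. The sketch below is illustrative only: the verbs come from the list above, but the tool names, scope strings, and hourly limits are made-up assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class VerbPolicy:
    tool: str                     # the only tool/API this verb may call
    scopes: FrozenSet[str]        # scopes the agent credential must hold
    environments: FrozenSet[str]  # environments where the verb is permitted
    max_calls_per_hour: int       # simple rate limit


# Hypothetical policy table; tool names, scopes, and limits are assumptions.
POLICIES = {
    "create_ticket": VerbPolicy(
        "ticketing_api", frozenset({"tickets:write"}),
        frozenset({"dev", "staging", "prod"}), 100),
    "refund_payment": VerbPolicy(
        "payments_api", frozenset({"refunds:create"}),
        frozenset({"prod"}), 5),
    "merge_pr": VerbPolicy(
        "git_host_api", frozenset({"repo:write"}),
        frozenset({"dev", "staging"}), 20),
}


def is_allowed(verb: str, tool: str, granted_scopes: Iterable[str],
               environment: str, calls_this_hour: int) -> bool:
    policy = POLICIES.get(verb)
    if policy is None:
        return False  # verbs that are not listed are denied by default
    return (
        tool == policy.tool
        and policy.scopes <= set(granted_scopes)
        and environment in policy.environments
        and calls_this_hour < policy.max_calls_per_hour
    )
```

Because the table is plain data, the same structure can be rendered for a customer admin who asks what the agent is allowed to do.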

3) Build an approval model people will use
- Categorize actions by risk: low (execute automatically), medium (execute and notify a human), high (block until a human approves).
- Make approvals happen in existing workflows (Slack/Teams, Jira/ServiceNow, email) rather than only your UI.
- Include “approve with edits” for common parameters (amount, date, target environment).
- Log who approved, when, and what changed.
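
The risk tiers and "approve with edits" could look roughly like this. The tier assignments, the `route` and `approve_with_edits` helpers, and the `ApprovalRecord` shape are all hypothetical; delivering the request through Slack, Jira, or email is out of scope for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Tuple

# Hypothetical tier assignments for the example verbs.
RISK_TIERS = {"create_ticket": "low", "merge_pr": "medium", "refund_payment": "high"}
ROUTES = {"low": "auto", "medium": "notify", "high": "approve"}


def route(verb: str) -> str:
    # Unclassified verbs fall back to the strictest tier.
    return ROUTES[RISK_TIERS.get(verb, "high")]


@dataclass(frozen=True)
class ApprovalRecord:
    approver: str
    approved_at: datetime
    changes: Dict[str, Tuple[Any, Any]]  # parameter -> (proposed, approved)


def approve_with_edits(proposed: Dict[str, Any], edits: Dict[str, Any],
                       approver: str) -> Tuple[Dict[str, Any], ApprovalRecord]:
    # Apply the approver's edits on top of the agent's proposal.
    final = {**proposed, **edits}
    # Keep only parameters whose value actually changed.
    changes = {
        key: (proposed.get(key), value)
        for key, value in edits.items()
        if proposed.get(key) != value
    }
    record = ApprovalRecord(approver, datetime.now(timezone.utc), changes)
    return final, record
```

For example, an approver who lowers a refund from 120 to 80 yields final parameters with `amount` set to 80 and a record whose `changes` maps `amount` to `(120, 80)`.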

4) Make execution deterministic at the boundary
- Enforce JSON schema validation for every tool call.
- Add policy checks before execution (actor, environment, resource, time window).
- Prefer idempotent operations; add idempotency keys where possible.
- Provide dry-run/plan mode for high-impact operations (Terraform-style plan/apply).
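
A rough sketch of the boundary checks, under stated assumptions: the hand-rolled `validate_args` stands in for a real JSON Schema validator, `REFUND_ARGS` is a made-up argument spec, and the `seen_keys` set stands in for a durable idempotency store. Policy checks (actor, environment, resource, time window) would slot in next to validation and are omitted here.

```python
import hashlib
import json
from typing import Any, Dict, List, Set

# Hypothetical argument spec for refund_payment: field name -> exact type.
REFUND_ARGS = {"payment_id": str, "amount_cents": int, "environment": str}


def validate_args(args: Dict[str, Any], spec: Dict[str, type]) -> List[str]:
    errors = []
    for name in sorted(spec.keys() - args.keys()):
        errors.append(f"missing field: {name}")
    for name in sorted(args.keys() - spec.keys()):
        errors.append(f"unexpected field: {name}")
    for name in sorted(spec.keys() & args.keys()):
        # Exact type match, so True is not accepted where an int is required.
        if type(args[name]) is not spec[name]:
            errors.append(f"wrong type for {name}")
    return errors


def idempotency_key(tenant_id: str, verb: str, args: Dict[str, Any]) -> str:
    # Canonical JSON so the same arguments always hash to the same key.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{tenant_id}:{verb}:{canonical}".encode()).hexdigest()


def execute(tenant_id: str, verb: str, args: Dict[str, Any],
            seen_keys: Set[str], dry_run: bool = True) -> Dict[str, Any]:
    errors = validate_args(args, REFUND_ARGS)
    if errors:
        return {"status": "rejected", "errors": errors}
    key = idempotency_key(tenant_id, verb, args)
    if dry_run:
        # Plan mode: report what would happen without side effects.
        return {"status": "planned", "idempotency_key": key}
    if key in seen_keys:
        return {"status": "duplicate", "idempotency_key": key}
    seen_keys.add(key)
    # The real tool call would go here.
    return {"status": "applied", "idempotency_key": key}
```

Defaulting `dry_run` to `True` means a caller has to opt in to side effects, which mirrors the plan/apply split.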

5) Verify outcomes, not just attempts
- After tool execution, run verification queries (did the ticket get created? did the deploy actually roll back?).
- Define a rollback or compensating action for each of your highest-risk actions.
- Escalate to a human with full context when verification fails.
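
The execute, verify, compensate, escalate sequence can be sketched as one wrapper. `run_with_verification` and its four callables are hypothetical names; in practice `verify` would query the target system and `escalate` would page or message a person.

```python
from typing import Any, Callable, Dict


def run_with_verification(
    action: Callable[[], Dict[str, Any]],
    verify: Callable[[Dict[str, Any]], bool],
    compensate: Callable[[Dict[str, Any]], None],
    escalate: Callable[[Dict[str, Any]], None],
) -> str:
    result = action()
    # Check the outcome in the target system, not just that the call returned.
    if verify(result):
        return "verified"
    context = {"result": result, "compensated": False, "compensation_error": None}
    try:
        compensate(result)
        context["compensated"] = True
    except Exception as exc:
        # A failed rollback is reported to the human, never swallowed.
        context["compensation_error"] = repr(exc)
    escalate(context)
    return "escalated"
```

The wrapper never retries on its own, so a failed verification always ends with a human seeing the full context.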

6) Treat the audit log as a product feature
- Record: user intent, model plan, tool calls (args + responses), diffs, approvals, and verification results.
- Make logs exportable for security teams (SIEM-friendly formats).
- Provide retention controls and deletion policies that match your customer’s needs.
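
As an illustration of an exportable record, the sketch below emits one JSON object per line (JSON Lines), a format most log pipelines can ingest. The field names are assumptions taken from the list above, not a standard schema.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def audit_record(tenant_id: str, intent: str, plan: List[str],
                 tool_calls: List[Dict[str, Any]], diffs: List[str],
                 approvals: List[Dict[str, Any]],
                 verification: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tenant_id": tenant_id,
        "intent": intent,              # what the user asked for
        "plan": plan,                  # what the model proposed
        "tool_calls": tool_calls,      # args and responses
        "diffs": diffs,
        "approvals": approvals,
        "verification": verification,  # outcome of post-execution checks
    }


def to_jsonl(records: List[Dict[str, Any]]) -> str:
    # json.dumps escapes embedded newlines, so each record stays on one line.
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)
```

Writing the record only after verification means a single line answers what was asked, what ran, who approved it, and whether it worked.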

7) Draw hard data boundaries
- Be explicit about what data is sent to third-party model providers.
- Support redaction for sensitive fields (tokens, keys, PII) before model calls.
- Separate customer tenants at every layer (storage, logs, caches).
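
Redaction before model calls might start from something like the following. It is a deliberately small sketch: the `SENSITIVE_KEYS` set and the single email pattern are assumptions, and real PII detection needs far broader coverage than one regex.

```python
import re
from typing import Any, Optional

# Hypothetical list of field names whose values must never reach the model.
SENSITIVE_KEYS = {"token", "api_key", "password", "ssn"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(value: Any, key: Optional[str] = None) -> Any:
    # Sensitive field names are masked whatever their value looks like.
    if isinstance(key, str) and key.lower() in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        # Free text is scanned for email addresses.
        return EMAIL_RE.sub("[EMAIL]", value)
    return value
```

The function returns a new structure and leaves its input untouched, so the original payload can still be used for the actual tool call.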

Exit criteria (don’t ship without these)
- A customer admin can answer: “What can the agent do, exactly?”
- A customer admin can revoke access instantly without filing a ticket.
- Every high-risk action is gated by approval and produces a verifiable audit trail.
- The agent fails safely (no partial writes without logs, no silent retries that hide damage).