Agent Production Readiness Checklist (2026)

Use this to move an agent from “works in a demo” to “safe to operate in production.” Focus on controls that are testable: identity, permissions, budgets, and auditability.

1) Identity (Non-human principal)
- Create a dedicated identity per agent in your IdP/IAM (Okta, Microsoft Entra ID, Google Cloud Identity, AWS IAM, etc.). No shared keys.
- Assign an explicit human owner (team + on-call rotation) and document purpose and scope.
- Prefer short-lived credentials (OIDC/workload identity) over long-lived API keys.
- Implement rotation and revocation. Prove you can kill access quickly.
- Ensure downstream systems show the agent identity in their own audit logs (GitHub, Jira, Slack, cloud provider).

2) Authorization (Tool and resource scoping)
- Default-deny tool access. Create an allowlist of tools/endpoints/resources.
- Separate read permissions from write permissions; keep write privileges minimal.
- Add hard rules in code for forbidden actions (delete data, modify IAM, transfer funds) unless explicitly approved.
- Use existing protections: GitHub branch protection, required reviews, CI checks, database RBAC.
- Maintain versioned tool schemas so you can correlate behavior to a specific tool contract.

3) Budgets and quotas (Cost + blast radius)
- Cap model requests/tokens per agent and per time window.
- Cap tool calls per run and per day.
- Cap write actions per run (PRs opened, messages sent, tickets transitioned).
- Add timeouts and circuit breakers for external calls.
- Define safe failure behavior: when quotas hit, the agent stops and escalates.

4) Approvals (Human-in-the-loop where it matters)
- Define “protected actions” that require explicit human approval.
- Approval UI must show a diff/preview of side effects (e.g., patch, email body, CRM field changes).
- Log approver identity, timestamp, and what they approved.
- Make approvals easy to revoke and rerun.

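The default-deny allowlist from section 2 and the per-run caps from section 3 can be enforced in one gate that every tool call passes through. The sketch below is one possible shape, not a reference implementation: `RunPolicy`, `PolicyViolation`, `QuotaExceeded`, the tool names, and the default caps are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when a tool is not on the allowlist."""


class QuotaExceeded(Exception):
    """Raised when a per-run cap is hit; the caller should stop and escalate."""


@dataclass
class RunPolicy:
    # Hypothetical tool names. Anything not listed in either set is denied.
    read_tools: frozenset = frozenset({"jira.get_issue", "github.get_file"})
    write_tools: frozenset = frozenset({"github.open_pr"})
    max_tool_calls: int = 20     # cap on all tool calls in one run
    max_write_actions: int = 2   # cap on side-effecting calls in one run
    tool_calls: int = field(default=0, init=False)
    write_actions: int = field(default=0, init=False)

    def authorize(self, tool: str) -> None:
        """Check the allowlist, then the budgets, then count the call."""
        is_write = tool in self.write_tools
        if not is_write and tool not in self.read_tools:
            raise PolicyViolation(f"tool not on allowlist: {tool}")
        if self.tool_calls >= self.max_tool_calls:
            raise QuotaExceeded("tool-call cap reached for this run")
        if is_write and self.write_actions >= self.max_write_actions:
            raise QuotaExceeded("write-action cap reached for this run")
        # Counters only advance for calls that were actually allowed.
        self.tool_calls += 1
        if is_write:
            self.write_actions += 1
```

Calling `authorize` before dispatching each tool call keeps the decision testable in isolation: a unit test can assert that an unlisted tool raises `PolicyViolation` and that the second write raises `QuotaExceeded` when `max_write_actions=1`. Per-day and per-agent caps would need shared state outside the process, which this sketch does not cover.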
5) Audit trail and forensic replay
- Every run gets a run_id tied to the triggering input (ticket ID/webhook/user request).
- Log: model name/provider, system instructions version, tool schema version.
- Log every tool call with arguments, response, timestamps, and errors.
- Record external object links/IDs created or modified (PR URL, Slack permalink, Jira issue key).
- Prove replay: reconstruct the full chain from input → decisions → tool calls → side effects.

6) Security and privacy hygiene
- Treat tool I/O as sensitive: apply access controls, retention rules, and redaction for secrets/PII.
- Scan for secret leakage (agent outputs and tool payloads).
- Run adversarial tests: prompt injection in docs/tickets, ambiguous instructions, malformed tool responses.

Definition of done: you can answer, precisely and quickly, “What can this agent do?”, “What did it do?”, and “How do we stop it?”
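The audit-trail items in section 5 map naturally onto two record types: a run header created once per trigger, and one record per tool call that points back to it by run_id. The following is a minimal sketch under that assumption; the function names (`new_run`, `tool_call_record`) and field names are hypothetical, and redaction of secrets/PII (section 6) is assumed to happen before arguments and responses reach these functions.

```python
import json
import time
import uuid


def new_run(trigger: str, model: str, instructions_version: str,
            tool_schema_version: str) -> dict:
    """Create the run header that every later record points back to."""
    return {
        "run_id": str(uuid.uuid4()),
        "trigger": trigger,  # ticket ID, webhook ID, or user request ID
        "model": model,
        "instructions_version": instructions_version,
        "tool_schema_version": tool_schema_version,
        "started_at": time.time(),
    }


def tool_call_record(run: dict, tool: str, arguments: dict,
                     response=None, error=None, external_ref=None) -> str:
    """Serialize one tool call as a JSON line keyed by run_id."""
    record = {
        "run_id": run["run_id"],
        "tool": tool,
        "arguments": arguments,
        "response": response,
        "error": error,
        "external_ref": external_ref,  # e.g. PR URL or Jira issue key
        "timestamp": time.time(),
    }
    return json.dumps(record, sort_keys=True)
```

Writing one JSON line per tool call to append-only storage means replay reduces to filtering by run_id and sorting by timestamp, which is a cheap way to test the "prove replay" item.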