Agent Readiness Checklist (2026)

Use this to pressure-test whether you’re shipping a productized agent or a chat demo.

1) Workflow Definition (Outcome, Not “AI”)
- Name one workflow the agent owns end-to-end (e.g., “draft and open a support ticket with the right tags,” “prepare a PR against a repo”).
- Define the start trigger and the end condition.
- List the external systems touched (e.g., GitHub, Jira, Salesforce, Stripe, Google Workspace, Microsoft Graph).
- Define what “success” looks like in a way you can verify (diff passes tests; ticket has correct fields; email is approved before sending).

2) Permissions and Identity
- Decide whether the agent uses a service identity, delegated user identity, or both.
- Enumerate OAuth scopes / API permissions required per integration.
- Separate read scopes from write scopes. Default to read-only.
- Define role-based access in your product: who can connect integrations, who can approve writes, who can view logs.

3) Write Safety (Where Risk Lives)
- For every write action, define: propose-only vs auto-commit.
- Add approval UX for irreversible actions (send email, merge PR, change billing, modify CRM records).
- Enforce strict tool schemas (JSON schema or equivalent) and deny-by-default tool access.
- Add rate limits and budget caps per workspace/project.

4) Audit Trail and Observability
- Log every tool call: timestamp, actor (agent identity), target system, parameters (redacted as needed), result, and correlation ID.
- Store an “action ledger” customers can search.
- Add tracing across steps (retrieval → planning → tool → post-processing).
- Provide export options for enterprise buyers (at minimum, downloadable logs; ideally SIEM-friendly formats).

5) Evals and Release Discipline
- Create a small fixture set from real failure cases (bad classifications, wrong tool parameters, unsafe outputs).
- Run the eval suite on every change to prompts, tools, retrieval settings, and model versions.
- Pin model versions; define a rollback procedure.
- Add a “canary” mode for new models/prompts on a small subset of traffic or internal workspaces.

6) Data Handling and Retention
- Document what data is sent to model providers and what is stored.
- Define retention windows for traces and logs.
- Implement redaction for sensitive fields in logs.
- Ensure vector stores / caches have clear deletion semantics.

7) Incident Playbook (Assume Something Breaks)
- One-click ability to pause agent runs.
- Ability to revoke tokens and disconnect integrations.
- A documented escalation path to human operators.
- A customer-facing incident narrative: what happened, what was affected, what logs exist.

If you can’t confidently check off Permissions, Write Safety, Audit Trail, and Evals, don’t ship “autonomy.” Ship propose-and-approve with an action ledger. That’s what gets deployed.
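
As a rough illustration of propose-and-approve with an action ledger, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than a prescribed design: the class names (`ProposeAndApproveGate`, `LedgerEntry`), the action names in `IRREVERSIBLE_ACTIONS`, the keys in `SENSITIVE_KEYS`, and the four result strings are all hypothetical. It shows deny-by-default tool access, an approval requirement for irreversible writes, a pause switch, and a redacted ledger entry for every call.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable

# Hypothetical names for illustration; a real product would derive these
# from its own tool schemas and data-classification rules.
IRREVERSIBLE_ACTIONS = {"send_email", "merge_pr", "change_billing", "modify_crm_record"}
SENSITIVE_KEYS = {"token", "password", "api_key"}


def redact(params: dict[str, Any]) -> dict[str, Any]:
    """Mask sensitive fields before they are written to the ledger."""
    return {k: ("[REDACTED]" if k in SENSITIVE_KEYS else v) for k, v in params.items()}


@dataclass
class LedgerEntry:
    correlation_id: str
    timestamp: str
    actor: str
    target_system: str
    action: str
    parameters: dict[str, Any]
    approved_by: str | None
    result: str


@dataclass
class ProposeAndApproveGate:
    actor: str
    # Deny-by-default: only tools registered here can ever execute.
    allowed_tools: dict[str, Callable[..., Any]]
    ledger: list[LedgerEntry] = field(default_factory=list)
    paused: bool = False  # stand-in for the "pause agent runs" control

    def run(
        self,
        target_system: str,
        action: str,
        params: dict[str, Any],
        approved_by: str | None = None,
    ) -> str:
        if self.paused:
            result = "blocked"
        elif action not in self.allowed_tools:
            result = "denied"
        elif action in IRREVERSIBLE_ACTIONS and approved_by is None:
            # Irreversible write without a human approver: record a proposal only.
            result = "proposed"
        else:
            self.allowed_tools[action](**params)
            result = "committed"

        # Every call is logged, including the ones that did not execute.
        self.ledger.append(
            LedgerEntry(
                correlation_id=str(uuid.uuid4()),
                timestamp=datetime.now(timezone.utc).isoformat(),
                actor=self.actor,
                target_system=target_system,
                action=action,
                parameters=redact(params),
                approved_by=approved_by,
                result=result,
            )
        )
        return result
```

Calling `run(...)` for an irreversible action without `approved_by` returns `"proposed"` and leaves the target system untouched; the same call with an approver executes the tool and returns `"committed"`. Both outcomes land in `ledger`, which is the part a customer would search or export.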