Agent Readiness Checklist (2026)

Use this to pressure-test whether you’re shipping a productized agent or a chat demo.

1) Workflow Definition (Outcome, Not “AI”)
- Name one workflow the agent owns end-to-end (e.g., “draft and open a support ticket with the right tags,” “prepare a PR against a repo”).
- Define the start trigger and the end condition.
- List the external systems touched (e.g., GitHub, Jira, Salesforce, Stripe, Google Workspace, Microsoft Graph).
- Define what “success” looks like in a way you can verify (diff passes tests; ticket has correct fields; email is approved before sending).
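
One way to keep this definition honest is to write it down as data with a verifiable success check. The sketch below assumes a hypothetical `WorkflowSpec` type and made-up trigger, end-condition, and field names; it only illustrates the shape, not a required format.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of fields a finished ticket must carry.
REQUIRED_FIELDS = ("ticket_id", "tags", "priority")


@dataclass(frozen=True)
class WorkflowSpec:
    name: str
    start_trigger: str
    end_condition: str
    systems: tuple
    # Returns True only when the outcome can be verified from the run result.
    is_success: Callable[[dict], bool]


support_ticket = WorkflowSpec(
    name="draft and open a support ticket with the right tags",
    start_trigger="inbound customer email",        # assumed trigger
    end_condition="ticket exists in the tracker",  # assumed end condition
    systems=("Jira",),
    # Success means every required field is present and non-empty.
    is_success=lambda result: all(result.get(f) for f in REQUIRED_FIELDS),
)
```

If the success check cannot be written as a function of the run's result, the workflow is probably not yet defined tightly enough to verify.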

2) Permissions and Identity
- Decide whether the agent uses a service identity, delegated user identity, or both.
- Enumerate OAuth scopes / API permissions required per integration.
- Separate read scopes from write scopes. Default to read-only.
- Define role-based access in your product: who can connect integrations, who can approve writes, who can view logs.
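
A minimal sketch of the read/write split and the product roles, assuming hypothetical scope strings (`issues.read`, `issues.write`) and role names. Real integrations define their own scope vocabularies; nothing here mirrors a specific provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationScopes:
    name: str
    read: frozenset
    write: frozenset = frozenset()


def scopes_to_request(integration, allow_write=False):
    """Read-only by default; write scopes are added only on explicit opt-in."""
    if allow_write:
        return integration.read | integration.write
    return integration.read


# Hypothetical product roles mapped to the actions they may perform.
ROLE_ACTIONS = {
    "admin": {"connect_integration", "approve_write", "view_logs"},
    "approver": {"approve_write", "view_logs"},
    "viewer": {"view_logs"},
}


def can(role, action):
    """Unknown roles get no permissions."""
    return action in ROLE_ACTIONS.get(role, set())


# Illustrative integration with placeholder scope names.
tracker = IntegrationScopes(
    "tracker",
    read=frozenset({"issues.read"}),
    write=frozenset({"issues.write"}),
)
```

Keeping the write set separate makes the permission request reviewable: anyone can see exactly what changes when `allow_write` is turned on.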

3) Write Safety (Where Risk Lives)
- For every write action, decide whether it is propose-only (the agent drafts, a human commits) or auto-commit.
- Add approval UX for irreversible or hard-to-undo actions (send email, merge PR, change billing, modify CRM records).
- Enforce strict tool schemas (JSON schema or equivalent) and deny-by-default tool access.
- Add rate limits and budget caps per workspace/project.
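
The controls above can be combined in a single gate that every tool call passes through. This is a sketch under stated assumptions: `ToolSpec`, `ToolGate`, and `ToolDenied` are hypothetical names, the schema check is a hand-rolled stand-in for a real JSON Schema validator, and the budget is a simple call counter rather than a time-windowed rate limit.

```python
from dataclasses import dataclass
from enum import Enum


class WriteMode(Enum):
    PROPOSE_ONLY = "propose_only"
    AUTO_COMMIT = "auto_commit"


@dataclass(frozen=True)
class ToolSpec:
    name: str
    params: dict  # required parameter name -> expected Python type
    mode: WriteMode = WriteMode.PROPOSE_ONLY  # safest mode is the default


class ToolDenied(Exception):
    pass


class ToolGate:
    def __init__(self, allowed, max_calls):
        self._tools = {spec.name: spec for spec in allowed}
        self._remaining = max_calls  # budget cap for one workspace/project

    def check(self, name, args):
        spec = self._tools.get(name)
        if spec is None:  # deny by default: anything not listed is refused
            raise ToolDenied(f"tool not allowlisted: {name}")
        if set(args) != set(spec.params):  # no missing or extra parameters
            raise ToolDenied(f"unexpected parameter set for {name}")
        for key, expected in spec.params.items():
            if not isinstance(args[key], expected):
                raise ToolDenied(f"{key} must be {expected.__name__}")
        if self._remaining <= 0:
            raise ToolDenied("call budget exhausted")
        self._remaining -= 1
        if spec.mode is WriteMode.AUTO_COMMIT:
            return "execute"
        return "queue_for_approval"
```

Rejected calls do not consume budget in this sketch; whether they should is a product decision.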

4) Audit Trail and Observability
- Log every tool call: timestamp, actor (agent identity), target system, parameters (redacted as needed), result, and correlation ID.
- Store an “action ledger” customers can search.
- Add tracing across steps (retrieval → planning → tool → post-processing).
- Provide export options for enterprise buyers (at minimum, downloadable logs; ideally SIEM-friendly formats).
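
As an illustration, a tool-call record and a searchable ledger might look like the sketch below. `ToolCallRecord` and `ActionLedger` are hypothetical names, storage is in memory only, and JSON Lines is used as one plausible export format; parameters are assumed to be redacted before they reach the record.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def new_correlation_id():
    """One ID shared by every step of a run, so steps can be traced together."""
    return uuid.uuid4().hex


@dataclass(frozen=True)
class ToolCallRecord:
    actor: str           # agent identity
    target_system: str
    tool: str
    parameters: dict     # assumed already redacted by the caller
    result: str
    correlation_id: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class ActionLedger:
    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def search(self, **filters):
        """Return records whose fields equal every given filter value."""
        return [
            r for r in self._records
            if all(getattr(r, key) == value for key, value in filters.items())
        ]

    def export_jsonl(self):
        """One JSON object per line, a shape many log pipelines can ingest."""
        return "\n".join(
            json.dumps(asdict(r), sort_keys=True) for r in self._records
        )
```

Searching by `correlation_id` returns every tool call from a single run, which is usually the first question asked during an incident.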

5) Evals and Release Discipline
- Create a small fixture set from real failure cases (bad classifications, wrong tool parameters, unsafe outputs).
- Run the eval suite on every change to prompts, tools, retrieval settings, and model versions.
- Pin model versions; define a rollback procedure.
- Add a “canary” mode for new models/prompts on a small subset of traffic or internal workspaces.
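
A rough sketch of how fixtures, a pinned model, and canary routing could fit together. The model identifier, fixture contents, and function names are all hypothetical, and the agent is reduced to a function from input text to a chosen tool name so the example stays self-contained.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical pinned identifier; rolling back means restoring the prior value.
PINNED_MODEL = "example-model-2026-01-01"


@dataclass(frozen=True)
class Fixture:
    name: str
    input_text: str
    expected_tool: str


def run_evals(agent, fixtures):
    """Return the names of fixtures the agent gets wrong."""
    return [f.name for f in fixtures if agent(f.input_text) != f.expected_tool]


def release_allowed(agent, fixtures):
    """Gate a prompt, tool, retrieval, or model change on a clean eval run."""
    return not run_evals(agent, fixtures)


def in_canary(workspace_id, percent):
    """Stable bucketing: a workspace always lands in the same bucket."""
    digest = hashlib.sha256(workspace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100 < percent
```

Hash-based bucketing keeps a workspace's canary assignment stable across runs, so a customer does not flip between the old and new configuration.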

6) Data Handling and Retention
- Document what data is sent to model providers and what is stored.
- Define retention windows for traces and logs.
- Implement redaction for sensitive fields in logs.
- Ensure vector stores / caches have clear deletion semantics.
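
Redaction and retention can both be expressed as small, testable functions. The sketch below assumes hypothetical field names and retention windows (30 days for traces, 365 for audit logs) and redacts top-level keys only; nested payloads would need a recursive walk.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention windows per record kind; pick values that match policy.
RETENTION = {
    "trace": timedelta(days=30),
    "audit_log": timedelta(days=365),
}

# Hypothetical sensitive field names.
SENSITIVE_FIELDS = {"email", "access_token", "card_number"}


def redact(payload):
    """Mask sensitive top-level fields before the payload is logged."""
    return {
        key: "[REDACTED]" if key in SENSITIVE_FIELDS else value
        for key, value in payload.items()
    }


def purge_expired(records, now):
    """Keep only records still inside the retention window for their kind."""
    return [
        r for r in records
        if now - r["created_at"] <= RETENTION[r["kind"]]
    ]
```

Passing `now` in explicitly, rather than reading the clock inside the function, makes the retention rule easy to test against fixed dates.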

7) Incident Playbook (Assume Something Breaks)
- One-click ability to pause agent runs.
- Ability to revoke tokens and disconnect integrations.
- A documented escalation path to human operators.
- A customer-facing incident narrative: what happened, what was affected, what logs exist.
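
The pause control is the piece most worth prototyping early. A minimal sketch, assuming a hypothetical `RunControl` class and an agent run modeled as a list of callables; a production system would likely back the flag with shared storage so that every worker sees it.

```python
import threading


class RunControl:
    """A pause flag that is checked before every agent step."""

    def __init__(self):
        self._paused = threading.Event()

    def pause(self):
        self._paused.set()

    def resume(self):
        self._paused.clear()

    def run_steps(self, steps):
        """Run steps in order; stop before the next step once paused.

        Returns the number of steps that actually executed.
        """
        executed = 0
        for step in steps:
            if self._paused.is_set():
                break
            step()
            executed += 1
        return executed
```

Checking the flag between steps means a pause takes effect before the next tool call, not mid-call; in-flight writes still need their own timeout or rollback handling.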

If you can’t confidently check off Permissions, Write Safety, Audit Trail, and Evals (sections 2 through 5), don’t ship “autonomy.” Ship propose-and-approve with an action ledger. That’s what gets deployed.