Agent UX & Governance Launch Checklist (2026)

Use this checklist before you ship any agent that reads or writes to systems-of-record (CRM, ticketing, code, finance, messaging). The goal: predictable outcomes, defensible governance, and controllable costs.

1) Workflow scope (contract)
- Name one workflow and write its boundaries (what it does / does not do).
- Enumerate allowed tools + actions (read, draft, update, send, delete).
- Define hard stops: prohibited data types (PII/PHI), prohibited destinations (external email), and prohibited bulk actions.
- Decide the “output artifact” (draft email, PR diff, ticket, record update) that users can inspect.

2) Task system primitives
- Every run has: task_id, owner, timestamps, state machine (queued/running/needs-approval/completed/failed).
- Store intermediate artifacts (drafts/diffs/previews) separate from the chat transcript.
- Provide a user-visible receipt: sources accessed, actions taken, approvals, and record IDs touched.

3) Governance & blast radius
- Implement RBAC for: run agent, approve actions, edit policies, view logs.
- Add blast radius labels: read-only vs write; single vs bulk; internal vs external.
- Require previews for any write. Require approvals for high blast radius actions.
- Build a kill switch: disable agent globally; disable per-tool (e.g., “no sends”).

4) Observability & audit
- Capture tool call logs with deterministic identifiers (record IDs, PR numbers, ticket IDs).
- Set retention defaults (e.g., 30/180 days) and document them.
- Provide export to customer systems (SIEM/log storage) if enterprise.
- Define an incident playbook: how to reconstruct a run, revoke access, and notify admins.

5) Cost envelope & pricing readiness
- Set p50/p95 cost targets per successful task (inference + tool costs).
- Add routing: small model for intent/gating; escalate to larger model when required.
- Add caching for repeated, verified answers (policy/onboarding/troubleshooting).
- Decide pricing alignment: seat + fair use, usage-based, or tiered workflows.

6) Success metrics (instrument before GA)
- Track: completion rate, escalation rate, rollback rate (24h), edit distance, cost per successful task.
- Add “time-to-first-proof” (how quickly users see a verifiable preview/citation).
- Define success thresholds for ring-by-ring rollout (internal → design partners → beta → GA).

Launch rule of thumb: If you can’t show (a) what the agent did, (b) who approved it, (c) how to undo it, and (d) what it cost at p95, you’re not ready for production.