Agent UX & Governance Launch Checklist (2026)

Use this checklist before you ship any agent that reads or writes to systems-of-record (CRM, ticketing, code, finance, messaging). The goal: predictable outcomes, defensible governance, and controllable costs.

1) Workflow scope (contract)
- Name one workflow and write its boundaries (what it does / does not do).
- Enumerate allowed tools + actions (read, draft, update, send, delete).
- Define hard stops: prohibited data types (PII/PHI), prohibited destinations (external email), and prohibited bulk actions.
- Decide the “output artifact” (draft email, PR diff, ticket, record update) that users can inspect.

2) Task system primitives
- Every run has: task_id, owner, timestamps, state machine (queued/running/needs-approval/completed/failed).
- Store intermediate artifacts (drafts/diffs/previews) separate from the chat transcript.
- Provide a user-visible receipt: sources accessed, actions taken, approvals, and record IDs touched.

3) Governance & blast radius
- Implement RBAC for: run agent, approve actions, edit policies, view logs.
- Add blast radius labels: read-only vs write; single vs bulk; internal vs external.
- Require previews for any write. Require approvals for high blast radius actions.
- Build a kill switch: disable agent globally; disable per-tool (e.g., “no sends”).

4) Observability & audit
- Capture tool call logs with deterministic identifiers (record IDs, PR numbers, ticket IDs).
- Set retention defaults (e.g., 30/180 days) and document them.
- Provide export to customer systems (SIEM/log storage) if enterprise.
- Define an incident playbook: how to reconstruct a run, revoke access, and notify admins.

5) Cost envelope & pricing readiness
- Set p50/p95 cost targets per successful task (inference + tool costs).
- Add routing: small model for intent/gating; escalate to larger model when required.
- Add caching for repeated, verified answers (policy/onboarding/troubleshooting).
- Decide pricing alignment: seat + fair use, usage-based, or tiered workflows.

6) Success metrics (instrument before GA)
- Track: completion rate, escalation rate, rollback rate (24h), edit distance, cost per successful task.
- Add “time-to-first-proof” (how quickly users see a verifiable preview/citation).
- Define success thresholds for ring-by-ring rollout (internal → design partners → beta → GA).

Launch rule of thumb: If you can’t show (a) what the agent did, (b) who approved it, (c) how to undo it, and (d) what it cost at p95, you’re not ready for production.
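
As a rough illustration of the task primitives in section 2, the blast radius rules in section 3, and parts (a) and (b) of the rule of thumb, here is a minimal Python sketch. Everything in it beyond the state names and the task_id/owner fields is an assumption made for the example: the `Task` and `Action` classes, the `TRANSITIONS` table, the `disabled_tools` kill-switch parameter, and the choice to treat "high blast radius" as any write that is bulk or external. It is not a reference implementation, and a real system would add timestamps, RBAC checks, and persistent storage.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    NEEDS_APPROVAL = "needs-approval"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table: the checklist names the states, not the edges.
TRANSITIONS = {
    State.QUEUED: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.NEEDS_APPROVAL, State.COMPLETED, State.FAILED},
    State.NEEDS_APPROVAL: {State.RUNNING, State.FAILED},
    State.COMPLETED: set(),
    State.FAILED: set(),
}


@dataclass(frozen=True)
class Action:
    tool: str        # e.g. "crm", "email" (hypothetical tool names)
    write: bool      # read-only vs write
    bulk: bool       # single vs bulk
    external: bool   # internal vs external
    record_id: str   # deterministic identifier for the audit log

    def needs_preview(self) -> bool:
        # Checklist rule: any write requires a preview.
        return self.write

    def needs_approval(self) -> bool:
        # Assumption: "high blast radius" means a bulk or external write.
        return self.write and (self.bulk or self.external)


@dataclass
class Task:
    task_id: str
    owner: str
    state: State = State.QUEUED
    actions: list = field(default_factory=list)
    approvals: dict = field(default_factory=dict)  # record_id -> approver

    def transition(self, new_state: State) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state

    def submit(self, action: Action, approver=None, disabled_tools=frozenset()) -> bool:
        """Record the action, or park the task until someone approves it."""
        if action.tool in disabled_tools:  # per-tool kill switch
            raise PermissionError(f"tool disabled: {action.tool}")
        if action.needs_approval():
            if approver is None:
                self.transition(State.NEEDS_APPROVAL)
                return False
            self.approvals[action.record_id] = approver
            if self.state is State.NEEDS_APPROVAL:
                self.transition(State.RUNNING)
        self.actions.append(action)
        return True

    def receipt(self) -> dict:
        """User-visible summary: what was touched and who approved it."""
        return {
            "task_id": self.task_id,
            "owner": self.owner,
            "state": self.state.value,
            "records_touched": [a.record_id for a in self.actions],
            "approvals": dict(self.approvals),
        }
```

In this sketch, an external send submitted without an approver moves the task to needs-approval and is not recorded; resubmitting it with an approver records both the action and the approver, so the receipt can answer "what did the agent do" and "who approved it". Undo paths and p95 cost, parts (c) and (d) of the rule, are out of scope here.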