Agentic UX Launch Kit (2026)

Use this checklist to ship a first agentic workflow that users trust and finance can justify.

1) Choose the workflow (scope)
- Pick ONE workflow with clear start/end (e.g., “refund small charges”, “triage inbound bug reports”, “prepare renewal emails”).
- Bound the action set to 3–7 tool actions max.
- Define success in a single sentence: “A run is successful when ____ happens without human edits.”
- Define blast radius: max objects touched per run (e.g., 20 records) and max dollars impacted (e.g., $25).

2) Define the delegation contract (UX)
- Delegation modes to support: Draft → Prepare → Execute w/ approval → Auto-execute (policy).
- For each mode, write what the agent may do, must ask before doing, and must show after doing.
- Require previews before mutations (diffs, affected records, recipients, amounts).

3) Guardrails (policy engine)
- Tool allowlist: which systems can be called (Salesforce, Stripe, GitHub, etc.).
- Action allowlist: which endpoints/methods are permitted.
- Hard blocks: actions never allowed (delete prod data, send to new domains, refund > limit).
- Rate limits: actions/minute and total actions/run.
- Environment separation: sandbox vs production credentials.

4) Receipts (auditability)
- Store a run envelope: run_id, user_id, agent_version, policy version, delegation mode.
- Log all tool calls (request + response metadata) and link to affected objects.
- Provide a human-readable run log in-product.
- Provide export for admins (CSV/JSON) for incident review.

5) Rollback + safety
- For every mutation, define an undo path (restore previous state or compensating action).
- Add a kill switch: stop active runs immediately.
- Add staged rollout: 5% → 25% → 100%, with automatic rollback on thresholds.

6) Metrics (ROI + reliability)
- Per-run: cost ($), success/fail reason, human intervention rate, time-to-complete.
- Downstream: cycle time (e.g., ticket resolution), error/rework rate, CSAT or incident rate.
- Targets for first 30 days: >60% completion, <10% intervention for routine tasks, <1% rollback on safe actions.

7) Launch plan
- Pilot with 10–30 power users; collect qualitative feedback weekly.
- Maintain a “policy changelog” and “agent version changelog” so behavior changes are traceable.
- Document an incident response runbook: how to pause, investigate, remediate, and communicate.

If you can’t preview it, log it, and undo it, don’t automate it.