AGENTIC UI SPEC STARTER (PLAN / PROOF / PLAYBACK)

Use this to spec one AI workflow end-to-end. Keep it strict. If you can’t answer a prompt with a product decision, you’re not ready to automate it.

1) Workflow Definition
- Name the workflow:
- User job-to-be-done (one sentence):
- Systems touched (internal + external):
- State changes (create/update/delete/notify):
- “Manual” deterministic path (link to existing UI or describe what you’ll build):

2) Plan Panel (What will happen)
- Plan output format: checklist / diff / batch list / API call list
- Editable fields the user can change (not by re-prompting):
- Batch behavior: per-item toggles? per-item notes?
- Approval UX:
  - Single Run button? Multi-step confirmations?
  - What triggers a higher-friction approval?

3) Proof Panel (Why this is the plan)
- Evidence types shown:
  - Records read (IDs + links):
  - Policies applied (named rules):
  - Retrieval sources (doc links + timestamps if available):
- What you will NOT show (and why):
- If evidence is missing, what action is disabled?

4) Playback Panel (What actually happened)
- Timeline events you will log and display:
  - Plan generated
  - Approval captured (who/when)
  - Tool calls (tool name, inputs summary)
  - Writes (record IDs, before/after if possible)
  - Errors + retries
- Export requirements: CSV/JSON? Admin-only?
- Debug UX: how a human re-runs a failed step safely

5) Identity, Permissions, and Scope
- Acting identity: user / service account / dedicated agent identity
- Required roles/permissions:
- Tenant/workspace scoping rules:
- Least-privilege tool access list (explicit allowlist):

6) Budgets and Limits (Blast Radius)
- Max records per run:
- Max notifications/emails per run:
- Rate limits:
- Timeouts:
- “Kill switch” location (UI + admin):

7) Rollback / Reversibility
- What can be undone automatically:
- What requires compensating actions:
- How the user initiates rollback:
- Data versioning approach (if any):

8) Failure Modes (Write them down)
- Top 5 likely failures (data changed mid-run, missing permission, tool outage, ambiguous targets, etc.):
- For each: user-facing message + operator-facing log detail

Definition of Done
- A user can: preview, edit, approve, and review outcomes without reading a chat transcript.
- An operator can: explain any change by inspecting Playback and exported logs.
- The workflow still works via the deterministic escape hatch when the model is unavailable.
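
Illustrative sketch (not part of the template itself): one way the Plan panel's per-item toggles, the section 6 record budget, and the Playback timeline could fit together in code. Every name here (`PlanItem`, `Run`, `max_records`, the event kinds, the `write` callback) is a hypothetical stand-in chosen for this example, not a prescribed schema, and the `write` function is assumed to return a (before, after) pair for the record it touched.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class PlanItem:
    record_id: str
    action: str            # e.g. "update" or "notify" (hypothetical values)
    enabled: bool = True   # per-item toggle edited in the Plan panel


@dataclass
class Run:
    max_records: int                             # "Max records per run" budget
    items: list = field(default_factory=list)    # the editable plan
    events: list = field(default_factory=list)   # Playback timeline
    approved_by: Optional[str] = None

    def log(self, kind: str, **detail) -> None:
        # Every timeline entry carries a UTC timestamp plus event-specific detail.
        stamp = datetime.now(timezone.utc).isoformat()
        self.events.append({"at": stamp, "kind": kind, **detail})

    def plan(self, items) -> None:
        self.items = list(items)
        self.log("plan_generated", count=len(self.items))

    def approve(self, user: str) -> None:
        # The budget is checked against enabled items only, before approval.
        active = [i for i in self.items if i.enabled]
        if len(active) > self.max_records:
            self.log("error", reason="budget_exceeded", requested=len(active))
            raise ValueError("plan exceeds max records per run")
        self.approved_by = user
        self.log("approval_captured", who=user)

    def execute(self, write: Callable) -> None:
        # No writes happen without a captured approval.
        if self.approved_by is None:
            raise PermissionError("run has not been approved")
        for item in self.items:
            if not item.enabled:
                continue
            before, after = write(item)
            self.log("write", record_id=item.record_id, before=before, after=after)


# Usage: three planned items, one switched off by the user, budget of two.
run = Run(max_records=2)
run.plan([PlanItem("rec-1", "update"),
          PlanItem("rec-2", "update"),
          PlanItem("rec-3", "update")])
run.items[2].enabled = False            # user edits the plan without re-prompting
run.approve("alice")
run.execute(lambda item: ("old", "new"))  # stand-in for a real tool call
kinds = [e["kind"] for e in run.events]
# kinds == ["plan_generated", "approval_captured", "write", "write"]
```

The point of the sketch is that the timeline is written by the same code path that enforces approval and budgets, so an operator reading Playback sees exactly what the run was allowed to do. A real implementation would also need the identity, rollback, and failure-mode decisions from sections 5, 7, and 8, which this example leaves out.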