AGENTIC UI SPEC STARTER (PLAN / PROOF / PLAYBACK)

Use this to spec one AI workflow end-to-end. Keep it strict. If you can’t answer a prompt below with a product decision, you’re not ready to automate the workflow.

1) Workflow Definition
- Name the workflow:
- User job-to-be-done (one sentence):
- Systems touched (internal + external):
- State changes (create/update/delete/notify):
- “Manual” deterministic path (link to existing UI or describe what you’ll build):

2) Plan Panel (What will happen)
- Plan output format: checklist / diff / batch list / API call list
- Fields the user can edit directly (without re-prompting):
- Batch behavior: per-item toggles? per-item notes?
- Approval UX:
 - Single Run button? Multi-step confirmations?
 - What triggers a higher-friction approval?
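
One possible data shape behind the Plan Panel, as a minimal Python sketch. The names `PlanItem`, `Plan`, and `approval_level`, and the rule that deletes or batches over 50 items need extra confirmation, are assumptions for illustration; substitute your own product decisions.

```python
from dataclasses import dataclass, field


@dataclass
class PlanItem:
    record_id: str
    action: str            # "create" | "update" | "delete" | "notify"
    enabled: bool = True   # per-item toggle the user can flip in the UI
    note: str = ""         # per-item note


@dataclass
class Plan:
    items: list[PlanItem] = field(default_factory=list)

    def selected(self) -> list[PlanItem]:
        """Items the user left enabled; only these should run."""
        return [item for item in self.items if item.enabled]


def approval_level(plan: Plan, batch_threshold: int = 50) -> str:
    """Return "single" for a one-click Run, "multi_step" for higher friction.

    Assumed rule: any delete, or a batch above the threshold, raises friction.
    """
    selected = plan.selected()
    if any(item.action == "delete" for item in selected):
        return "multi_step"
    if len(selected) > batch_threshold:
        return "multi_step"
    return "single"
```

Because the friction level is computed from the enabled items only, toggling off a risky item in the UI lowers the approval requirement without a re-prompt.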

3) Proof Panel (Why this is the plan)
- Evidence types shown:
 - Records read (IDs + links):
 - Policies applied (named rules):
 - Retrieval sources (doc links + timestamps if available):
- What you will NOT show (and why):
- If evidence is missing, what action is disabled?
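
The "evidence missing, action disabled" rule can be written down as data rather than prose. A minimal sketch, assuming a hypothetical `REQUIRED_EVIDENCE` mapping and `Evidence` container; the specific requirements shown are placeholders, not recommendations.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    records_read: list[str] = field(default_factory=list)
    policies_applied: list[str] = field(default_factory=list)
    retrieval_sources: list[str] = field(default_factory=list)


# Hypothetical mapping: action -> evidence fields that must be non-empty.
REQUIRED_EVIDENCE = {
    "update": ["records_read"],
    "delete": ["records_read", "policies_applied"],
}


def disabled_actions(evidence: Evidence) -> dict[str, list[str]]:
    """Map each disabled action to the evidence fields it is missing."""
    disabled = {}
    for action, required in REQUIRED_EVIDENCE.items():
        missing = [name for name in required if not getattr(evidence, name)]
        if missing:
            disabled[action] = missing
    return disabled
```

Returning the missing fields, not just a boolean, lets the Proof Panel tell the user why a button is greyed out.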

4) Playback Panel (What actually happened)
- Timeline events you will log and display:
 - Plan generated
 - Approval captured (who/when)
 - Tool calls (tool name, inputs summary)
 - Writes (record IDs, before/after if possible)
 - Errors + retries
- Export requirements: CSV/JSON? Admin-only?
- Debug UX: how a human re-runs a failed step safely
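
A minimal sketch of a timeline log that covers the event types above and exports to JSON. `TimelineEvent`, `Playback`, and the `kind` strings are hypothetical names; a real implementation would persist events rather than hold them in memory.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class TimelineEvent:
    kind: str    # e.g. "plan_generated", "approval", "tool_call", "write", "error", "retry"
    detail: dict
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


class Playback:
    def __init__(self) -> None:
        self.events: list[TimelineEvent] = []

    def log(self, kind: str, **detail) -> None:
        """Append one event with a UTC timestamp."""
        self.events.append(TimelineEvent(kind, detail))

    def export_json(self) -> str:
        """Serialize the full timeline in the order events were logged."""
        return json.dumps([asdict(event) for event in self.events], indent=2)
```

For example, `pb.log("write", record_id="r1", before={"status": "open"}, after={"status": "closed"})` records a write with its before/after values, and `pb.export_json()` produces the operator-facing export.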

5) Identity, Permissions, and Scope
- Acting identity: user / service account / dedicated agent identity
- Required roles/permissions:
- Tenant/workspace scoping rules:
- Least-privilege tool access list (explicit allowlist):
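
An explicit allowlist plus tenant scoping can be enforced in one check before every tool call. A minimal sketch; `AgentIdentity`, `ToolNotAllowed`, `authorize`, and the tool names in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentIdentity:
    name: str
    tenant_id: str
    allowed_tools: frozenset[str]


class ToolNotAllowed(PermissionError):
    pass


def authorize(identity: AgentIdentity, tool: str, target_tenant: str) -> None:
    """Raise unless the tool is allowlisted and the target tenant is in scope."""
    if tool not in identity.allowed_tools:
        raise ToolNotAllowed(f"{identity.name} may not call {tool}")
    if target_tenant != identity.tenant_id:
        raise ToolNotAllowed(f"{identity.name} is scoped to tenant {identity.tenant_id}")
```

With `AgentIdentity("invoice-agent", "t1", frozenset({"crm.read", "crm.update"}))`, a call to `authorize(agent, "crm.delete", "t1")` raises because the tool is absent from the allowlist, which is the default-deny behavior least privilege asks for.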

6) Budgets and Limits (Blast Radius)
- Max records per run:
- Max notifications/emails per run:
- Rate limits:
- Timeouts:
- “Kill switch” location (UI + admin):
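
Count limits and the kill switch can be checked once, before anything executes. A minimal sketch with hypothetical names (`Budget`, `BudgetExceeded`, `check_budget`); rate limits and timeouts are left out because they depend on the tools being called.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_records: int
    max_notifications: int
    kill_switch_on: bool = False  # flipped from the UI or admin console


class BudgetExceeded(RuntimeError):
    pass


def check_budget(budget: Budget, records: int, notifications: int) -> None:
    """Call before executing a run; raises instead of running partially."""
    if budget.kill_switch_on:
        raise BudgetExceeded("kill switch is on")
    if records > budget.max_records:
        raise BudgetExceeded(
            f"{records} records exceeds limit of {budget.max_records}"
        )
    if notifications > budget.max_notifications:
        raise BudgetExceeded(
            f"{notifications} notifications exceeds limit of {budget.max_notifications}"
        )
```

Rejecting the whole run up front, rather than stopping midway, keeps the outcome easy to explain in Playback.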

7) Rollback / Reversibility
- What can be undone automatically:
- What requires compensating actions:
- How the user initiates rollback:
- Data versioning approach (if any):
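
For writes that can be undone automatically, one simple versioning approach is to record the prior value of each record and replay those values in reverse. A minimal in-memory sketch; `VersionedStore` is a hypothetical stand-in for your system of record, and it does not cover side effects such as sent emails, which need compensating actions instead.

```python
from typing import Optional


class VersionedStore:
    """In-memory stand-in for a store that remembers prior values."""

    def __init__(self) -> None:
        self.data: dict[str, dict] = {}
        self.undo_log: list[tuple[str, Optional[dict]]] = []

    def write(self, record_id: str, value: dict) -> None:
        """Save the previous value (None if the record is new), then write."""
        previous = self.data.get(record_id)
        self.undo_log.append(
            (record_id, None if previous is None else dict(previous))
        )
        self.data[record_id] = dict(value)

    def rollback(self) -> None:
        """Undo all logged writes, most recent first."""
        while self.undo_log:
            record_id, previous = self.undo_log.pop()
            if previous is None:
                self.data.pop(record_id, None)  # record was created; remove it
            else:
                self.data[record_id] = previous
```

Undoing in reverse order matters when a run writes the same record more than once: the record ends at the value it had before the run started.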

8) Failure Modes (Write them down)
- Top 5 likely failures (data changed mid-run, missing permission, tool outage, ambiguous targets, etc.):
- For each: user-facing message + operator-facing log detail
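
Pairing each failure with a user-facing message and an operator-facing log entry can be done with a small lookup table. A minimal sketch; the failure codes, message wording, and the `report` helper are hypothetical examples, not a recommended catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    code: str
    user_message: str
    retryable: bool


FAILURES = {
    mode.code: mode
    for mode in [
        FailureMode(
            "data_changed_mid_run",
            "Some records changed after the plan was generated. Review the updated plan.",
            False,
        ),
        FailureMode(
            "missing_permission",
            "You don't have permission for one or more steps. Ask an admin for access.",
            False,
        ),
        FailureMode(
            "tool_outage",
            "A connected system is unavailable. Try again shortly.",
            True,
        ),
    ]
}


def report(code: str, **context) -> tuple[str, dict]:
    """Return (user-facing message, operator-facing log entry) for a failure."""
    mode = FAILURES[code]
    log_entry = {"code": mode.code, "retryable": mode.retryable, **context}
    return mode.user_message, log_entry
```

The user sees only the message; the log entry carries the code, the retry flag, and whatever context the caller passes, such as the tool name and attempt number.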

Definition of Done
- A user can: preview, edit, approve, and review outcomes without reading a chat transcript.
- An operator can: explain any change by inspecting Playback and exported logs.
- The workflow still works via the “Manual” deterministic path (section 1) when the model is unavailable.