AI MODE SPEC TEMPLATE (Copy/Paste)

1) Mode Name + One-line Promise
- Mode name:
- User promise (one sentence, concrete outcome):
- Non-goals (what this mode will not do):

2) Entry Points + Exit Points
- Where can users enter this mode? (buttons, keyboard shortcuts, API, integrations)
- How does a user know they are in the mode? (UI badge, color, banner, run state)
- How does a user exit? (close, cancel run, revert to classic)

3) Scope (What it can see)
- Default context: (this doc / this ticket / selected text / current repo)
- Expandable context: (workspace search, connected apps like Google Drive/Slack/Jira)
- External search: (allowed/blocked; if allowed, how are external results labeled?)
- Data exclusions: (PII rules, private channels, restricted folders)
- Source transparency: (citations/snippets shown? yes/no; where)

4) Authority (What it can change)
- Read-only suggestions:
- Write targets (draft only vs write to final):
- Action targets (create ticket, send email, merge PR, deploy):
- Required approvals (per action / per batch / role-based):
- Rollback/undo design (what is reversible, how long, by whom):
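One way to make the approval and rollback answers concrete is a per-action policy table. The sketch below is a minimal illustration, not a prescribed design: the `ActionPolicy` type, the `may_execute` helper, the role names, and every policy value are hypothetical placeholders for the answers filled in above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionPolicy:
    """Hypothetical per-action policy; field names are assumptions."""
    requires_approval: bool
    approver_roles: frozenset = frozenset()
    reversible: bool = False


# Placeholder values only; replace with the answers from this section.
ACTION_POLICIES = {
    "create_ticket": ActionPolicy(requires_approval=False, reversible=True),
    "send_email": ActionPolicy(True, frozenset({"owner", "admin"})),
    "merge_pr": ActionPolicy(True, frozenset({"maintainer"}), reversible=True),
    "deploy": ActionPolicy(True, frozenset({"admin"})),
}


def may_execute(action: str, approver_role: Optional[str] = None) -> bool:
    """Return True only if the action is known and its approval rule is met."""
    policy = ACTION_POLICIES.get(action)
    if policy is None:
        # Actions missing from the table are denied by default.
        return False
    if not policy.requires_approval:
        return True
    return approver_role in policy.approver_roles
```

Denying unlisted actions by default means a newly added action target stays blocked until someone writes down its approval rule.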

5) Budgets + Degradation
- Latency expectations (foreground vs background jobs):
- Usage caps (per user / per workspace):
- What happens at cap? (block, queue until reset, switch to smaller model, deterministic fallback)
- What happens on timeout? (partial results, retry button, saved run state)
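The cap questions above can be sketched as a single routing decision made before each run. This is an illustrative sketch under stated assumptions: `UsageCaps`, `route_request`, and the behavior names are hypothetical, and the cap numbers in the usage example are placeholders, not recommendations.

```python
from dataclasses import dataclass

# The four at-cap options listed above, as hypothetical identifiers.
AT_CAP_BEHAVIORS = {
    "block",
    "queue_until_reset",
    "smaller_model",
    "deterministic_fallback",
}


@dataclass(frozen=True)
class UsageCaps:
    per_user: int
    per_workspace: int


def route_request(user_used: int, workspace_used: int, caps: UsageCaps,
                  at_cap: str = "queue_until_reset") -> str:
    """Return "run" while under both caps, else the configured at-cap behavior."""
    if at_cap not in AT_CAP_BEHAVIORS:
        raise ValueError(f"unknown at-cap behavior: {at_cap}")
    if user_used >= caps.per_user or workspace_used >= caps.per_workspace:
        return at_cap
    return "run"


# Usage with placeholder numbers.
caps = UsageCaps(per_user=50, per_workspace=1000)
decision = route_request(user_used=50, workspace_used=200, caps=caps)
```

Making the at-cap behavior an explicit parameter forces the spec author to pick one instead of leaving it to whatever the backend happens to do.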

6) UX States (User-visible run state machine)
Define the states and what the UI shows:
- Idle
- Collecting context
- Running
- Needs approval
- Completed
- Failed (with reason)
- Canceled
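The states listed above can be written down as an explicit transition table, so that the UI and backend agree on which moves are legal. The sketch below is one possible shape: the state names come from the list, but the allowed transitions and the `advance` helper are assumptions to adjust per mode.

```python
from enum import Enum
from typing import Optional


class RunState(Enum):
    IDLE = "idle"
    COLLECTING_CONTEXT = "collecting_context"
    RUNNING = "running"
    NEEDS_APPROVAL = "needs_approval"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumed transitions; a real mode may allow more or fewer.
TRANSITIONS = {
    RunState.IDLE: {RunState.COLLECTING_CONTEXT},
    RunState.COLLECTING_CONTEXT: {
        RunState.RUNNING, RunState.FAILED, RunState.CANCELED,
    },
    RunState.RUNNING: {
        RunState.NEEDS_APPROVAL, RunState.COMPLETED,
        RunState.FAILED, RunState.CANCELED,
    },
    RunState.NEEDS_APPROVAL: {
        RunState.RUNNING, RunState.FAILED, RunState.CANCELED,
    },
    # Terminal states have no outgoing transitions.
    RunState.COMPLETED: set(),
    RunState.FAILED: set(),
    RunState.CANCELED: set(),
}


def advance(current: RunState, target: RunState,
            reason: Optional[str] = None) -> RunState:
    """Validate a transition; Failed must always carry a reason."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    if target is RunState.FAILED and not reason:
        raise ValueError("a Failed state requires a reason")
    return target
```

Requiring a reason on every move to Failed keeps the "Failed (with reason)" state honest: a run cannot fail silently.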

7) Safety/Policy
- Content policy constraints (refusals, warnings):
- Permission checks (what happens if the model requests data it can’t access):
- Logging policy (what is stored, retention window, exportability):
- Admin controls (disable mode, restrict sources, restrict actions):
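The permission-check and admin-control questions above can be sketched as a filter applied to every context request before anything reaches the model. This is a minimal sketch, assuming documents are identified by an ID and a source name; `filter_context` and its parameters are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def filter_context(
    requested: Iterable[Tuple[str, str]],  # (doc_id, source) pairs
    user_readable: Set[str],               # doc IDs the user may read
    admin_blocked_sources: Set[str],       # sources an admin has restricted
) -> Tuple[List[str], List[str]]:
    """Split requested documents into (allowed, denied) doc ID lists."""
    allowed: List[str] = []
    denied: List[str] = []
    for doc_id, source in requested:
        if source in admin_blocked_sources or doc_id not in user_readable:
            denied.append(doc_id)
        else:
            allowed.append(doc_id)
    return allowed, denied


# Usage with made-up IDs and source names.
allowed, denied = filter_context(
    requested=[("doc-1", "drive"), ("doc-2", "slack"), ("doc-3", "drive")],
    user_readable={"doc-1", "doc-2"},
    admin_blocked_sources={"slack"},
)
```

Returning the denied list alongside the allowed one gives the UI something to show the user and gives the run trace something to log as `policy_denied`.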

8) Operability (Required instrumentation)
- Run trace fields (model ID, prompt template version, tool calls, doc IDs):
- Error taxonomy (policy_denied, no_context, timeout, tool_error, model_error):
- Support workflow (how support can inspect a run without seeing sensitive content):
- Regression eval hook (golden task set; how it’s run before release):
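The trace fields and error taxonomy above can be captured in one record type, with a separate reduced view for support. The sketch below is illustrative only: `RunTrace`, `support_view`, and the `run_id` field are hypothetical names, and treating doc IDs as sensitive is an assumption that may not hold in every product.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ErrorCode(str, Enum):
    POLICY_DENIED = "policy_denied"
    NO_CONTEXT = "no_context"
    TIMEOUT = "timeout"
    TOOL_ERROR = "tool_error"
    MODEL_ERROR = "model_error"


@dataclass
class RunTrace:
    run_id: str  # assumed identifier, not named in the template
    model_id: str
    prompt_template_version: str
    tool_calls: List[str] = field(default_factory=list)  # tool names only
    doc_ids: List[str] = field(default_factory=list)
    error: Optional[ErrorCode] = None


def support_view(trace: RunTrace) -> dict:
    """Return run metadata with doc IDs reduced to a count."""
    return {
        "run_id": trace.run_id,
        "model_id": trace.model_id,
        "prompt_template_version": trace.prompt_template_version,
        "tool_calls": list(trace.tool_calls),
        "doc_count": len(trace.doc_ids),
        "error": trace.error.value if trace.error else None,
    }
```

Keeping the error code as a closed enum means dashboards and support tooling can group failures without parsing free-text messages.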

9) Launch Criteria
- What must be true before GA? (undo works, approvals enforced, citations reliable, caps enforced)
- Kill switch (how to disable the mode quickly if needed):
- User education (one in-product screen explaining what the mode sees and what it changes)

10) Open Questions
- Biggest unresolved trust risk:
- Biggest unresolved cost risk:
- Biggest unresolved failure mode: