AI MODE SPEC TEMPLATE (Copy/Paste)

1) Mode Name + One-line Promise
- Mode name:
- User promise (one sentence, concrete outcome):
- Non-goals (what this mode will not do):

2) Entry Points + Exit Points
- Where can users enter this mode? (buttons, keyboard shortcuts, API, integrations)
- How does a user know they are “in mode”? (UI badge, color, banner, run state)
- How does a user exit? (close, cancel run, revert to classic)

3) Scope (What it can see)
- Default context: (this doc / this ticket / selected text / current repo)
- Expandable context: (workspace search, connected apps like Google Drive/Slack/Jira)
- External search: (allowed/blocked; if allowed, how is it labeled)
- Data exclusions: (PII rules, private channels, restricted folders)
- Source transparency: (citations/snippets shown? yes/no; where)

4) Authority (What it can change)
- Read-only suggestions:
- Write targets (draft only vs write to final):
- Action targets (create ticket, send email, merge PR, deploy):
- Required approvals (per action / per batch / role-based):
- Rollback/undo design (what is reversible, how long, by whom):

5) Budgets + Degradation
- Latency expectations (foreground vs background jobs):
- Usage caps (per user / per workspace):
- What happens at cap? (block, queue until reset, switch to smaller model, deterministic fallback)
- What happens on timeout? (partial results, retry button, saved run state):

6) UX States (User-visible run state machine)
Define the states and what the UI shows:
- Idle
- Collecting context
- Running
- Needs approval
- Completed
- Failed (with reason)
- Canceled

7) Safety/Policy
- Content policy constraints (refusals, warnings):
- Permission checks (what happens if the model requests data it can’t access):
- Logging policy (what is stored, retention window, exportability):
- Admin controls (disable mode, restrict sources, restrict actions):

8) Operability (Required instrumentation)
- Run trace fields (model ID, prompt template version, tool calls, doc IDs):
- Error taxonomy (policy_denied, no_context, timeout, tool_error, model_error):
- Support workflow (how support can inspect a run without seeing sensitive content):
- Regression eval hook (golden tasks set; how it’s run before release):

9) Launch Criteria
- What must be true before GA? (undo works, approvals enforced, citations reliable, caps enforced)
- Kill switch (how to disable fast if needed):
- User education (one in-product screen explaining: what it sees, what it changes)

10) Open Questions
- Biggest unresolved trust risk:
- Biggest unresolved cost risk:
- Biggest unresolved failure mode:
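
APPENDIX (optional): Run state machine sketch

When filling in sections 6 and 8, it can help to write the run states and error codes down as a small state machine. The Python sketch below is one possible shape, not a required implementation. The state names and error codes come from the template; the transition table, the `Run` class, and the rule that a failure must carry an error code are illustrative assumptions to adapt to your own mode.

```python
from enum import Enum
from typing import Dict, List, Optional, Set


class RunState(Enum):
    """User-visible run states from section 6."""
    IDLE = "idle"
    COLLECTING_CONTEXT = "collecting_context"
    RUNNING = "running"
    NEEDS_APPROVAL = "needs_approval"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


class ErrorCode(Enum):
    """Error taxonomy from section 8."""
    POLICY_DENIED = "policy_denied"
    NO_CONTEXT = "no_context"
    TIMEOUT = "timeout"
    TOOL_ERROR = "tool_error"
    MODEL_ERROR = "model_error"


# ASSUMPTION: the template only names the states. Which transitions are
# legal is a design decision for each mode; this table is one example.
ALLOWED: Dict[RunState, Set[RunState]] = {
    RunState.IDLE: {RunState.COLLECTING_CONTEXT},
    RunState.COLLECTING_CONTEXT: {
        RunState.RUNNING, RunState.FAILED, RunState.CANCELED,
    },
    RunState.RUNNING: {
        RunState.NEEDS_APPROVAL, RunState.COMPLETED,
        RunState.FAILED, RunState.CANCELED,
    },
    RunState.NEEDS_APPROVAL: {
        RunState.RUNNING, RunState.COMPLETED,
        RunState.FAILED, RunState.CANCELED,
    },
    # Terminal states: no outgoing transitions.
    RunState.COMPLETED: set(),
    RunState.FAILED: set(),
    RunState.CANCELED: set(),
}


class Run:
    """Hypothetical run record that enforces the transition table."""

    def __init__(self) -> None:
        self.state: RunState = RunState.IDLE
        self.error: Optional[ErrorCode] = None
        self.history: List[RunState] = [RunState.IDLE]

    def transition(self, new_state: RunState,
                   error: Optional[ErrorCode] = None) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition: {self.state.value} -> {new_state.value}"
            )
        # "Failed (with reason)": require a code for FAILED, reject it elsewhere.
        if (new_state is RunState.FAILED) != (error is not None):
            raise ValueError("error code is required for FAILED and only for FAILED")
        self.state = new_state
        self.error = error
        self.history.append(new_state)


if __name__ == "__main__":
    run = Run()
    run.transition(RunState.COLLECTING_CONTEXT)
    run.transition(RunState.RUNNING)
    run.transition(RunState.FAILED, error=ErrorCode.TIMEOUT)
    print(run.state.value, run.error.value)
```

Keeping the transition table as data means the same table can drive the UI (which badge or banner to show per state) and the run trace (the `history` list), and makes it easy to review which paths lead to "Needs approval" before launch.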