AI INCIDENT RUNBOOK TEMPLATE (v1)

Purpose
- Provide a fast, repeatable response process for AI-related failures: wrong outputs, unsafe actions, data exposure, policy violations, or brand-risk behavior.

1) Incident Classes (pick one)
- Integrity: materially incorrect output presented as reliable guidance.
- Security: prompt injection, tool abuse, credential misuse, data exfiltration.
- Privacy: exposure of user/company sensitive data via context, logs, connectors, or training feedback.
- Compliance: regulated content/retention failures, prohibited use, audit gaps.
- Reputation: toxic/biased/brand-damaging outputs likely to spread.

2) Roles (assign before you need them)
- Incident Commander (IC): runs the timeline, makes calls, keeps focus.
- Technical Lead: owns diagnosis and remediation plan.
- Comms Lead: internal updates cadence; drafts external statement if needed.
- Security/Privacy: evaluates exposure, containment, and notification needs.
- Product Owner: decides user-facing behavior changes (safe mode, feature disable).

3) Severity (define for your org)
- Sev 0: confirmed sensitive data exposure or unsafe actions in production.
- Sev 1: high-likelihood user harm, compliance breach, or major brand risk.
- Sev 2: limited impact but reproducible failure affecting key workflows.
- Sev 3: low impact, edge-case quality failures.

4) First 15 Minutes Checklist
- Create incident channel + doc; note start time.
- Capture a minimal repro (prompt, inputs, retrieved docs, tool calls).
- Identify affected surface area: which product, users, regions, and model/prompt versions.
- Decide containment NOW: disable tool, narrow retrieval scope, force safe mode, or rollback.

5) Containment Menu (pre-approve options)
- Hard rollback to known-good model/prompt/tool-policy version.
- Disable high-risk tools (email, payments, admin actions, code execution).
- Restrict retrieval to whitelisted sources; block external URLs.
- Route to human review for specified intents.
- Turn on stricter policy filters for targeted topics.

6) Communications
- Internal: update every 30–60 minutes until contained; include what changed (rollbacks/flags).
- External (if trust is impacted): state what happened, what users should do, and what you changed.
- Preserve facts; avoid speculation.

7) Postmortem Prompts (write within 48 hours)
- What was the trigger? What signals did we miss?
- What was the blast radius and why?
- Which control failed: prompt, retrieval, tool permissioning, policy gate, eval coverage, logging?
- What permanent fixes will we ship? (tests, permissions, logging, product UX, training/process)
- What will we measure going forward? (eval regressions, policy hits, user reports, tool-call anomalies)

8) Hard Requirement for “Done”
- A new automated check exists (eval, policy rule, permission boundary, or deployment gate) that would have caught or limited this incident next time.