AI INCIDENT RUNBOOK TEMPLATE (v1) Purpose - Provide a fast, repeatable response process for AI-related failures: wrong outputs, unsafe actions, data exposure, policy violations, or brand-risk behavior. 1) Incident Classes (pick one) - Integrity: materially incorrect output presented as reliable guidance. - Security: prompt injection, tool abuse, credential misuse, data exfiltration. - Privacy: exposure of user/company sensitive data via context, logs, connectors, or training feedback. - Compliance: regulated content/retention failures, prohibited use, audit gaps. - Reputation: toxic/biased/brand-damaging outputs likely to spread. 2) Roles (assign before you need them) - Incident Commander (IC): runs the timeline, makes calls, keeps focus. - Technical Lead: owns diagnosis and remediation plan. - Comms Lead: internal updates cadence; drafts external statement if needed. - Security/Privacy: evaluates exposure, containment, and notification needs. - Product Owner: decides user-facing behavior changes (safe mode, feature disable). 3) Severity (define for your org) - Sev 0: confirmed sensitive data exposure or unsafe actions in production. - Sev 1: high-likelihood user harm, compliance breach, or major brand risk. - Sev 2: limited impact but reproducible failure affecting key workflows. - Sev 3: low impact, edge-case quality failures. 4) First 15 Minutes Checklist - Create incident channel + doc; note start time. - Capture a minimal repro (prompt, inputs, retrieved docs, tool calls). - Identify affected surface area: which product, users, regions, and model/prompt versions. - Decide containment NOW: disable tool, narrow retrieval scope, force safe mode, or rollback. 5) Containment Menu (pre-approve options) - Hard rollback to known-good model/prompt/tool-policy version. - Disable high-risk tools (email, payments, admin actions, code execution). - Restrict retrieval to whitelisted sources; block external URLs. - Route to human review for specified intents. - Turn on stricter policy filters for targeted topics. 6) Communications - Internal: update every 30–60 minutes until contained; include what changed (rollbacks/flags). - External (if trust is impacted): state what happened, what users should do, and what you changed. - Preserve facts; avoid speculation. 7) Postmortem Prompts (write within 48 hours) - What was the trigger? What signals did we miss? - What was the blast radius and why? - Which control failed: prompt, retrieval, tool permissioning, policy gate, eval coverage, logging? - What permanent fixes will we ship? (tests, permissions, logging, product UX, training/process) - What will we measure going forward? (eval regressions, policy hits, user reports, tool-call anomalies) 8) Hard Requirement for “Done” - A new automated check exists (eval, policy rule, permission boundary, or deployment gate) that would have caught or limited this incident next time.