AI VERIFICATION PLAYBOOK (ONE-PAGE)

Purpose
Use this to define how your team turns AI-generated drafts (code, text, configs, analyses) into verified output you can ship.

1) Pick the workflow (be specific)
- Workflow name: _______________________________
- Where it runs: (repo/service/tool) _______________________________
- What AI produces: (code, YAML, customer email, SQL, policy text) _______________________________
- Who is affected if it’s wrong: (customers, finance, security, uptime) _______________________________

2) Default risk posture (choose one)
A) Low risk (drafting only): wrong output is annoying, reversible, low blast radius.
B) Medium risk (guardrailed): wrong output can mislead users or waste time.
C) High risk (strict): wrong output can cause outages, security exposure, legal/compliance issues.
Selected posture: A / B / C

3) Proof requirements (what must exist before approval)
Citations / provenance
- For factual claims: link to source-of-truth (doc section, ticket, dashboard, contract).
- For configs: identify the target (module, cluster, account/project ID) and state the change rationale.

Testing / validation
- Minimum checks required: _______________________________
  Examples: unit tests, integration tests, contract tests, terraform plan, kubectl dry-run, SQL sample validation query.
- “Red flag” cases that must be tested: _______________________________
  Examples: authn/authz, billing, data deletion, migrations, rate limiting.

Human gate
- Required approver(s): _______________________________
- Ownership rule: interface owners approve schema/API changes.
- Review rule: reviewers verify invariants and rollback plan, not just diff aesthetics.

Rollout / rollback
- Rollout method: (feature flag / canary / staged / manual) _______________________________
- Rollback plan written? Y/N
- Monitoring note: what metric/log proves success? _______________________________
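
The proof requirements in this section can be turned into an automated pre-approval check. The sketch below is one possible shape, not a prescribed implementation: `ChangeRecord`, `missing_proof`, and all field names are hypothetical, and the required checks and approvers are whatever you filled in above.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRecord:
    """Hypothetical record of the proof attached to one AI-generated change."""
    evidence_links: list[str] = field(default_factory=list)
    checks_passed: list[str] = field(default_factory=list)
    approvers: list[str] = field(default_factory=list)
    rollback_plan_written: bool = False
    success_metric: str = ""


def missing_proof(record: ChangeRecord,
                  required_checks: list[str],
                  required_approvers: list[str]) -> list[str]:
    """Return the proof items still missing; an empty list means ready for approval."""
    missing = []
    if not record.evidence_links:
        missing.append("citations/provenance")
    for check in required_checks:
        if check not in record.checks_passed:
            missing.append(f"check: {check}")
    for approver in required_approvers:
        if approver not in record.approvers:
            missing.append(f"approver: {approver}")
    if not record.rollback_plan_written:
        missing.append("rollback plan")
    if not record.success_metric:
        missing.append("monitoring metric")
    return missing
```

A CI job or merge bot could call `missing_proof` and block approval while the returned list is non-empty.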

4) Refusal rules (what AI must not do)
- Never include secrets (API keys, tokens, credentials) in prompts or outputs.
- Never claim a control exists (security/compliance) without linked evidence.
- Never send customer-facing messages without a human send action (unless explicitly approved for this workflow).
- Never run destructive operations (DROP, DELETE without WHERE, prod migrations) without explicit human confirmation.
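
Some of these rules can be partially enforced with a pattern scan over prompts and outputs. The sketch below is illustrative only: the function name `refusal_violations` and every pattern are assumptions to be tuned to your own secret formats and SQL dialect, and a regex scan will miss cases, so it supplements the human gate rather than replacing it.

```python
import re

# Hypothetical patterns; adjust to the credential formats you actually use.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # shape of an AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
]

# Statements treated as destructive regardless of context.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"(?i)\bDROP\s+(TABLE|DATABASE|SCHEMA)\b"),
    re.compile(r"(?i)\bTRUNCATE\b"),
]

# Captures one DELETE statement up to the next semicolon (or end of text).
DELETE_STATEMENT = re.compile(r"(?i)\bDELETE\s+FROM\b[^;]*")


def refusal_violations(text: str) -> list[str]:
    """Return the refusal rules that the text appears to violate."""
    violations = []
    if any(p.search(text) for p in SECRET_PATTERNS):
        violations.append("possible secret")
    if any(p.search(text) for p in DESTRUCTIVE_PATTERNS):
        violations.append("destructive statement")
    for match in DELETE_STATEMENT.finditer(text):
        if not re.search(r"(?i)\bWHERE\b", match.group(0)):
            violations.append("DELETE without WHERE")
            break
    return violations
```

Any non-empty result would stop the workflow and route the draft to a human for explicit confirmation.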

5) Evals (your repeatable test set)
Create a small fixed set of “hard cases”:
- Case 1: _______________________________
- Case 2: _______________________________
- Case 3: _______________________________
For each case, define “pass” in plain language and name the source that must be cited.
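
A fixed eval set like this can be kept as data and rerun on every prompt or model change. The following is a minimal sketch under stated assumptions: `EvalCase`, `run_evals`, and the `generate` callable are hypothetical names, and the plain-language "pass" definition is reduced here to required phrases plus a required citation string, which is cruder than a human judgment.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One hard case from the fixed eval set (hypothetical structure)."""
    name: str
    prompt: str
    must_contain: tuple[str, ...]  # phrases the output needs in order to pass
    required_source: str           # source identifier that must be cited


def run_evals(cases: list[EvalCase],
              generate: Callable[[str], str]) -> dict[str, bool]:
    """Run every case through `generate` and report pass/fail by case name."""
    results = {}
    for case in cases:
        output = generate(case.prompt)
        # Content check is case-insensitive; the citation must match exactly.
        has_content = all(s.lower() in output.lower() for s in case.must_contain)
        cites_source = case.required_source in output
        results[case.name] = has_content and cites_source
    return results
```

Here `generate` stands in for whatever function calls your AI tool; passing a stub makes the harness itself testable without a model.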

6) Incident loop (what happens when it fails)
- Failure is logged as: (ticket type / label) _______________________________
- Post-incident requirement: add a guardrail (test, policy check, eval case, rollout change).
- Owner for guardrail implementation: _______________________________

Sign-off
- Workflow owner: __________________ Date: __________
- Security/Compliance (if needed): __________________ Date: __________
- Engineering/Product lead: __________________ Date: __________