DECISION SYSTEM SPEC — ONE-PAGE TEMPLATE 1) Decision Owned (be specific) - Name: - Single-sentence description (verb + object): - Output type: (approve/deny, create/update record, route, draft + publish, etc.) - Systems affected (systems-of-record): 2) Authority Boundary (what it is allowed to do) - Allowed actions (typed): - - - Disallowed actions: - - - Scope constraints: - Tenants/projects: - Rate limits / quotas: - Time windows (if relevant): - Required approvals (if any): - Action -> approver role: 3) Evidence Requirements (what it must check before acting) - Required sources-of-truth (must be machine-readable): - - - Required fields (must be present): - - - Evidence rendering in UI (receipts): - Links/IDs shown to reviewer: - Diffs/screenshots/log snippets: 4) Escalation Triggers (when it must stop and ask) - Missing or conflicting data - Policy conflict - Ambiguous user instruction - External dependency not reachable (API down) - Any detected anomaly you define (e.g., unusual amount, new vendor, new device) - Define the escalation destination: - Queue/channel: - On-call/role: - SLA: 5) Rollback + Reversibility - What “undo” looks like for each allowed action: - Action -> rollback procedure: - Data retention requirements for rollback: - User-visible messaging after rollback: 6) Policies + Authorization - Policy engine/location (e.g., OPA/Cedar/custom): - Who can grant the system permissions: - Secret handling (where tokens live, rotation, blast radius): 7) Observability + Audit - Required audit events (minimum): - proposed_action - evidence_collected - policy_check_result - tool_call (inputs redacted as needed) - state_change_committed - escalation_created - Correlation ID strategy (trace across steps): - Redaction rules (PII, credentials): 8) Evaluation Plan (treat as regression tests) - Scenario suite list (10–30 real tasks): - Edge cases you expect: - Tool failure simulations: - Acceptance criteria (qualitative is fine): - Release gate (what must pass before deploy): 9) User Experience - Where the system acts autonomously vs asks for approval: - What the user can edit: - How receipts are displayed: - How users report mistakes (feedback loop): 10) Kill Switch - Who can disable it: - What happens when disabled (fallback behavior): - Communication plan for affected users: Use this template as a forcing function: if you can’t fill a section with concrete, testable statements, you’re not ready to ship autonomy. Tighten the decision or reduce authority until you can.