AGENTIC WORKFLOW LAUNCH CHECKLIST (v2026)

Use this checklist to ship one agentic workflow safely and measurably. It’s designed for cross-functional use (PM, Eng, Security, Support).

1) WORKFLOW DEFINITION
- Name the workflow and write a one-sentence job statement (e.g., “Triages inbound support tickets and routes them to the correct queue”).
- Define “done” as a structured outcome (records created/updated, messages queued, tags applied).
- List explicit non-goals (what the workflow must not do).
- Identify the blast radius: the worst plausible failure and its cost (time, money, trust).

2) SCOPE + PERMISSIONS
- Enumerate all tools used, with read and write access listed separately.
- For each tool: specify objects, fields, and allowed operations.
- Define role-based access: which users can run the workflow, approve actions, or administer policies.
- Add caps: max actions per run, max actions per day, and rate limits.

3) POLICY + GOVERNANCE
- Create default tenant policies: allowed email domains, forbidden actions, business hours.
- Define approval rules (e.g., require approval when confidence < 0.78, when emailing multiple recipients, or when touching payment links).
- Specify data handling: retention period for traces, redaction of PII, region constraints.
- Decide on audit export needs (CSV, webhook, SIEM integration).

4) VERIFICATION + FAIL-SAFES
- Implement hard validators outside the model (required fields, allowlists/denylists, cap enforcement).
- Add a second-pass verifier (model-based or rule-augmented) to check policy compliance before executing writes.
- Implement safe failure modes: fall back to draft-only output or route to a human queue.
- Add rollback paths for every write where possible; document manual rollback steps where not.

5) OBSERVABILITY + METRICS
- Log step traces: inputs, retrieved context, tool calls, outputs, and final state changes.
- Define SLOs: job success rate, p95 latency, human intervention rate, MTTR, and cost per successful run.
- Build dashboards and alerts for success-rate drops, retry spikes, tool errors, and cost anomalies.

6) EVALUATION (EVALS)
- Create a golden set (50–100 cases) covering common, edge, and adversarial scenarios.
- Separate language quality from action quality; action quality gates production.
- Run evals on every workflow change (CI gate). Track regressions and approvals.

7) ROLLOUT PLAN
- Dogfood internally with trace review.
- Launch to 3–10 design partners with stricter approval gating.
- Use a cohort-based rollout; increase autonomy only when metrics meet target thresholds.

8) PRICING + PACKAGING
- Decide what’s included in seat pricing versus an automation add-on.
- Price around value: per run, per successful job, or tiered autonomy + governance features.
- Track gross margin by workflow (variable costs + support burden).

9) SUPPORT + INCIDENT RESPONSE
- Create a runbook: common failures, rollback steps, escalation paths.
- Define incident severity levels and comms templates for customer impact.
- Review traces weekly for new failure patterns; feed findings into policies/evals.

Exit Criteria (recommended):
- Constrained workflow success rate ≥ 95% on the golden set and ≥ 90% in the production cohort.
- Human intervention rate trending down week-over-week.
- A clear rollback path and MTTR under the agreed SLO.
- Cost per successful run stable and within margin targets.
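To make items 2–4 concrete, here is a minimal sketch of a hard validator that runs outside the model before any write executes: it enforces a per-run action cap, an email-domain allowlist, and the confidence-based approval rule. All names and policy values (the domain set, the cap of 20, the 0.78 threshold from section 3) are illustrative assumptions; a real deployment would load per-tenant policies from configuration.

```python
from dataclasses import dataclass, field

# Illustrative policy values; real deployments load these per tenant.
ALLOWED_EMAIL_DOMAINS = {"example.com"}
MAX_ACTIONS_PER_RUN = 20
APPROVAL_CONFIDENCE_THRESHOLD = 0.78

@dataclass
class ProposedAction:
    kind: str          # e.g. "send_email", "update_record"
    target: str        # e.g. recipient address or record id
    confidence: float  # model-reported confidence in [0, 1]

@dataclass
class Verdict:
    allowed: list = field(default_factory=list)
    needs_approval: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

def validate_run(actions: list[ProposedAction]) -> Verdict:
    """Hard validator that runs outside the model, before any write executes."""
    verdict = Verdict()
    for i, action in enumerate(actions):
        # Cap enforcement: anything past the per-run limit is rejected outright.
        if i >= MAX_ACTIONS_PER_RUN:
            verdict.rejected.append((action, "per-run action cap exceeded"))
            continue
        # Allowlist enforcement for outbound email.
        if action.kind == "send_email":
            domain = action.target.rsplit("@", 1)[-1].lower()
            if domain not in ALLOWED_EMAIL_DOMAINS:
                verdict.rejected.append((action, f"domain {domain!r} not allowlisted"))
                continue
        # Low-confidence actions fall back to the human approval queue (safe failure mode).
        if action.confidence < APPROVAL_CONFIDENCE_THRESHOLD:
            verdict.needs_approval.append((action, "confidence below threshold"))
            continue
        verdict.allowed.append(action)
    return verdict
```

The key design point is that these checks are deterministic code, not model output: the model proposes, the validator disposes, and every rejection or approval-queue routing is a loggable event that feeds the trace review in section 9.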