AGENTIC WORKFLOW LAUNCH CHECKLIST (v2026)

Use this checklist to ship one agentic workflow safely and measurably. It’s designed for cross-functional use (PM, Eng, Security, Support).

1) WORKFLOW DEFINITION
- Name the workflow and write a one-sentence job statement (e.g., “Triages inbound support tickets and routes them to the correct queue”).
- Define “done” as a structured outcome (records created/updated, messages queued, tags applied).
- List explicit non-goals (what the workflow must not do).
- Identify the blast radius: the worst plausible failure and its cost (time, money, trust).

2) SCOPE + PERMISSIONS
- Enumerate all tools used, with read and write access listed separately.
- For each tool: specify objects, fields, and allowed operations.
- Define role-based access: which users can run the workflow, approve actions, or administer policies.
- Add caps: max actions per run, max actions per day, and rate limits.

3) POLICY + GOVERNANCE
- Create default tenant policies: allowed email domains, forbidden actions, business hours.
- Define approval rules (e.g., require approval when confidence < 0.78, when emailing multiple recipients, or when touching payment links).
- Specify data handling: retention period for traces, redaction of PII, region constraints.
- Decide on audit export needs (CSV, webhook, SIEM integration).

4) VERIFICATION + FAIL-SAFES
- Implement hard validators outside the model (required fields, allowlists/denylists, cap enforcement).
- Add a second-pass verifier (model-based or rule-augmented) to check policy compliance before executing writes.
- Implement safe failure modes: fall back to draft-only output or route to a human queue.
- Add rollback paths for every write where possible; document manual rollback steps where not.

5) OBSERVABILITY + METRICS
- Log step traces: inputs, retrieved context, tool calls, outputs, and final state changes.
- Define SLOs: job success rate, p95 latency, human intervention rate, MTTR, and cost per successful run.
- Build dashboards and alerts for success-rate drops, retry spikes, tool errors, and cost anomalies.

6) EVALUATION (EVALS)
- Create a golden set (50–100 cases) covering common, edge, and adversarial scenarios.
- Separate language quality from action quality; action quality gates production.
- Run evals on every workflow change (CI gate). Track regressions and approvals.

7) ROLLOUT PLAN
- Dogfood internally with trace review.
- Launch to 3–10 design partners with stricter approval gating.
- Use a cohort-based rollout; increase autonomy only when metrics meet target thresholds.

8) PRICING + PACKAGING
- Decide what’s included in seat pricing versus an automation add-on.
- Price around value: per run, per successful job, or tiered autonomy + governance features.
- Track gross margin by workflow (variable costs + support burden).

9) SUPPORT + INCIDENT RESPONSE
- Create a runbook: common failures, rollback steps, escalation paths.
- Define incident severity levels and comms templates for customer impact.
- Review traces weekly for new failure patterns; feed findings into policies/evals.

Exit Criteria (recommended):
- Constrained workflow success rate ≥ 95% on the golden set and ≥ 90% in the production cohort.
- Human intervention rate trending down week-over-week.
- A clear rollback path and MTTR under the agreed SLO.
- Cost per successful run stable and within margin targets.
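To make items 2–4 concrete, here is a minimal sketch of a hard validator that runs outside the model before any write executes: it enforces a per-run action cap, an email-domain allowlist, and the confidence-based approval rule. All names and policy values (the domain set, the cap of 20, the 0.78 threshold from section 3) are illustrative assumptions; a real deployment would load per-tenant policies from configuration.

```python
from dataclasses import dataclass, field

# Illustrative policy values; real deployments load these per tenant.
ALLOWED_EMAIL_DOMAINS = {"example.com"}
MAX_ACTIONS_PER_RUN = 20
APPROVAL_CONFIDENCE_THRESHOLD = 0.78

@dataclass
class ProposedAction:
    kind: str          # e.g. "send_email", "update_record"
    target: str        # e.g. recipient address or record id
    confidence: float  # model-reported confidence in [0, 1]

@dataclass
class Verdict:
    allowed: list = field(default_factory=list)
    needs_approval: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

def validate_run(actions: list[ProposedAction]) -> Verdict:
    """Hard validator that runs outside the model, before any write executes."""
    verdict = Verdict()
    for i, action in enumerate(actions):
        # Cap enforcement: anything past the per-run limit is rejected outright.
        if i >= MAX_ACTIONS_PER_RUN:
            verdict.rejected.append((action, "per-run action cap exceeded"))
            continue
        # Allowlist enforcement for outbound email.
        if action.kind == "send_email":
            domain = action.target.rsplit("@", 1)[-1].lower()
            if domain not in ALLOWED_EMAIL_DOMAINS:
                verdict.rejected.append((action, f"domain {domain!r} not allowlisted"))
                continue
        # Low-confidence actions fall back to the human approval queue (safe failure mode).
        if action.confidence < APPROVAL_CONFIDENCE_THRESHOLD:
            verdict.needs_approval.append((action, "confidence below threshold"))
            continue
        verdict.allowed.append(action)
    return verdict
```

The key design point is that these checks are deterministic code, not model output: the model proposes, the validator disposes, and every rejection or approval-queue routing is a loggable event that feeds the trace review in section 9.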