ICMD Agent Workflow PRD + Launch Readiness Checklist (2026) Use this template to ship one production-grade agent workflow in 2–6 weeks. 1) Workflow Definition (Versioned) - Workflow name + version (e.g., invoice_reconciliation_v2) - Target user persona (job title, team size, tools they use) - Job-to-be-done statement (one sentence) - In scope: exact task boundaries (what the agent will do) - Out of scope: explicit exclusions (what it will never do) - Inputs: required fields + optional fields + allowed formats - Outputs: required artifacts (records created, fields updated, files produced) - Reversibility: how to undo actions (rollback plan) 2) Success Metrics (Pick 3–5) - Verified Completion Rate (VCR): % runs that pass checks - Time-to-outcome: median + p95 minutes per run - Human Minutes Saved (HMS): baseline minus post-agent time - Escalation rate: % runs requiring human review - Cost per verified outcome: total run cost / verified completions 3) Permissions & Guardrails - Least-authority role definition (what systems, read vs write) - Approval points (what requires human confirm) - Policy checks (rules engine, thresholds, required citations) - Data handling: PII redaction, retention window, export controls - Audit log fields: run_id, actor, tools called, data sources, changes made 4) Evaluation Plan (Before Launch) - Golden set: 50–300 representative cases (label source, freshness date) - Scoring dimensions: correctness, groundedness/citations, format compliance - Regression gates: define “no-ship” thresholds (e.g., VCR drop >2%) - Drift monitoring cadence: weekly review + monthly refresh of golden set 5) Observability & Ops - Run dashboard: VCR, error rate, latency p95, cost/run, retries - Incident playbook: how to pause workflow, roll back, notify customers - Support workflow: how CS accesses run history and exports evidence 6) Pricing & Packaging - Value hypothesis: hours saved or revenue protected per month - Pricing model: platform fee + per verified outcome OR tiered usage buckets - Limiters: max runs/day, max write actions, premium for high-stakes modes - ROI report: monthly customer summary metrics to defend renewal 7) Launch Plan (2-stage) - Stage 1 (Beta): read-only or reversible actions; heavy logging; opt-in users - Stage 2 (GA): expanded permissions; published SLA; documented controls - Post-launch: weekly eval review; ship workflow versioning + changelog If you can’t answer one item above with a specific number, policy, or owner, you’re not ready to scale the workflow—yet.