ICMD Agent Production Readiness Checklist (2026) Use this checklist to take an AI agent from pilot to production without getting trapped in demo-ware. 1) Pick the right first workflow (scope) - Choose a single “unit of work” (one ticket, one invoice, one alert) with a clear definition of DONE. - Confirm high frequency: at least 500–5,000 units/month so learning compounds. - Bound the downside: no irreversible actions at launch (payments, deletions, account closures). - Identify the owner with budget and KPIs (Controller, Head of Support, SecOps lead). 2) Define success metrics (outcomes + reliability) - Outcome KPI: e.g., backlog reduction %, time-to-close, days-to-close books. - Reliability KPI: task success rate (TSR), tool-call error rate, silent failure rate. - Human effort KPI: interventions per 100 tasks; median review time. - Cost KPI: cost-to-serve per unit and inference % of revenue. 3) Build guardrails before autonomy - Least-privilege permissions (scoped OAuth, RBAC). - Hard limits: max tool calls, max dollar cost per task, timeouts. - Approval workflow for any write action; step-up verification for sensitive steps. - Escalation rules: low confidence threshold and clear routing to humans. 4) Instrument everything (audit + observability) - Structured logs per task: inputs, tool calls, outputs, policy decisions. - Tracing for latency and cost; dashboards for failure taxonomy. - Exportable audit reports for compliance/security reviews. 5) Evaluation and rollout plan - Build an eval set from real history (200–1,000 cases) and keep it versioned. - Add a gating process: no prompt/policy/model changes without passing evals. - Roll out in phases: shadow mode → assist mode (human approve) → partial autonomy. - Define rollback: ability to disable tools or revert versions in minutes. 6) Pricing and packaging (make it legible) - Set pricing around outcomes (per resolved unit) plus a platform fee. - Publish what counts as a billable unit and what doesn’t (retries, failed tasks). - Align margins: target inference+tooling under 15–25% of revenue. 7) Security & compliance readiness - Data retention policy and customer controls. - Clear separation of customer data across tenants. - Incident response process (postmortems, SLAs, escalation contacts). Exit criteria for “production” - ≥90% TSR on low-risk workflows OR predictable safe escalation. - <1% silent failures. - 100% action logging with export. - Clear unit economics and a defined operator owner.