ICMD Agent Readiness Scorecard (2026)

Use this scorecard to move from “cool demo” to “production agent” with predictable ROI.

A) Workflow Fit (0–10)

1) Clear Done State: Can you define an unambiguous completion condition (e.g., ticket closed with correct tag; refund issued with policy justification)? If not, stop.
2) Repeatability: Do you have at least 500 similar tasks/month? If <100/month, ROI is harder unless the task is high-value (security, legal).
3) Toolability: Are the required actions available via APIs (Jira, Salesforce, Stripe, Workday, ServiceNow)? If the workflow depends on manual UI clicks, budget time for RPA or integration work.
4) Error Tolerance: What’s the cost of a wrong action? If a single error can cost >$5,000 or create regulatory exposure, require stricter verification and approvals.

B) Data & Context Readiness (0–10)

5) Source of Truth: List the systems of record. If more than 3 systems must be reconciled, plan a data consolidation step.
6) Retrieval Quality: Can you retrieve the right policy/docs reliably? Run 50 queries and measure the % with the correct doc in the top-5 results. Target 85%+ before scaling.
7) Data Hygiene: Identify PII/PHI/PCI fields. Define redaction rules and minimum context. “Retrieve less” is a security and cost win.

C) Safety, Governance, and Controls (0–10)

8) Identity Model: Create an agent identity per workflow. No shared keys. Use least privilege per tool.
9) Tool Gateway: Put tools behind a gateway that centralizes secrets, enforces policy, and logs every call.
10) Approvals & Thresholds: Define human approval rules (e.g., refunds >$500, customer-impacting changes, P0 incidents).
11) Auditability: Ensure replayable traces: prompt version, retrieved sources, tool calls, outputs, and final actions.

D) Reliability & Cost Management (0–10)

12) Golden Set: Build 200 real examples with expected outcomes and edge cases.
13) Verifiers: Add schema validation, deterministic checks, and (for risky tasks) a second-pass verifier.
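The verifier layer in item 13 can be sketched as code. This is a minimal illustration, not a library API: the field names (`ticket_id`, `refund_amount`, `policy_ref`) and the $500 auto-approval threshold (borrowed from item 10) are assumptions for a hypothetical refund workflow.

```python
# Illustrative verifier: schema validation plus deterministic business-rule
# checks. All field names and thresholds are hypothetical.
REQUIRED_FIELDS = {"ticket_id": str, "refund_amount": float, "policy_ref": str}
MAX_AUTO_REFUND = 500.0  # above this, require human sign-off (item 10)

def validate_schema(output: dict) -> list[str]:
    """Deterministic check 1: every required field present with the right type."""
    errors = []
    for field, ftype in REQUIRED_FIELDS.items():
        if field not in output:
            errors.append(f"missing field: {field}")
        elif not isinstance(output[field], ftype):
            errors.append(f"wrong type for {field}: expected {ftype.__name__}")
    return errors

def verify(output: dict) -> tuple[bool, list[str]]:
    """Returns (auto_approve, errors). Any error forces escalation to a human;
    a second-pass model verifier could be added for risky tasks."""
    errors = validate_schema(output)
    if errors:
        return False, errors
    if output["refund_amount"] <= 0:
        errors.append("refund must be positive")
    if output["refund_amount"] > MAX_AUTO_REFUND:
        errors.append("refund exceeds auto-approval threshold")
    return (not errors), errors
```

The point is that these checks are cheap and deterministic: they run before any write action, and every failure path ends in escalation rather than silent retry.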
14) SLOs: Define SLOs such as: 95% of tasks complete in <90s; escalation rate <5%; tool-call error rate <1%.
15) Cost per Completion: Track total cost per verified completion (model + tools + human review). Set a budget (e.g., <$1.50/task) and alert on anomalies.

Scoring Guidance

- 32–40: Ready for production write-access (with canary rollout).
- 24–31: Pilot in sandbox or read-only mode; close gaps in retrieval, verification, or permissions.
- <24: Not ready; redesign the workflow or choose a narrower, bounded scope.

Rollout Template

Week 1: Define the contract (schemas, tools, permissions); build the golden set.
Week 2: Implement the tool gateway, logging, and verifiers; run an offline eval.
Week 3: Shadow mode in production (no writes); measure cost and error modes.
Week 4: Canary write-access (5–10% of tasks) + weekly regression + incident runbook.

If you can’t measure cost per verified completion and escalation rate, you don’t have an agent—you have a prototype.
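The two metrics the closing line demands can be computed from per-task logs. A minimal sketch, assuming each task is logged with its model, tool, and human-review spend; the `TaskRecord` fields and the $1.50 budget are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class TaskRecord:
    # Hypothetical per-task log entry; field names are assumptions.
    model_cost: float   # LLM spend, USD
    tool_cost: float    # tool/API spend, USD
    review_cost: float  # human review time priced in USD
    verified: bool      # passed verifiers and reached the done state
    escalated: bool     # handed off to a human

BUDGET_PER_TASK = 1.50  # example budget from item 15

def cost_per_verified_completion(records: list[TaskRecord]) -> float:
    """Total spend (including failed tasks) amortized over verified completions."""
    verified = [r for r in records if r.verified]
    if not verified:
        return float("inf")  # no verified completions: cost is unbounded
    total = sum(r.model_cost + r.tool_cost + r.review_cost for r in records)
    return total / len(verified)

def escalation_rate(records: list[TaskRecord]) -> float:
    """Fraction of tasks escalated to a human (SLO target from item 14: <5%)."""
    return sum(r.escalated for r in records) / len(records)

def over_budget(records: list[TaskRecord]) -> bool:
    return cost_per_verified_completion(records) > BUDGET_PER_TASK
```

Note that failed and escalated tasks still count in the numerator: the agent's real unit economics include the work it wasted, which is why dividing total spend by verified completions (rather than by attempts) is the honest metric.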