AI CONTROL PLANE STARTER CHECKLIST (ONE-WORKFLOW ROLLOUT) Goal: Take one existing AI workflow in your product and put it behind a minimal control plane: identity, policy, routing, evaluation gate, and audit-ready logs. 1) Pick the workflow (be specific) - Name the user-facing feature and the exact trigger (UI action, API call, background job). - List the actions it can take (generate text, classify, extract, call tools, write back to systems). 2) Create an “AI bill of materials” (ABOM) - Model calls: provider(s), endpoints, model names used for chat/completions, embeddings, reranking. - Data sources: documents/tables/fields accessed; include where the data originates. - Tool calls: every function/tool the workflow can invoke (CRM update, ticket create, email send, etc.). - Outputs: where generated content is shown/stored (UI, email, database), and who can see it. 3) Put a gateway in front of model calls - Standardize a single internal interface for model invocation. - Add request/response logging hooks (before you add “fancy” features). - Add a routing policy stub: even if it always routes to one model today, force the decision into configuration. 4) Define policy rules (start small, enforce hard) - Data handling: what categories are forbidden (secrets, credentials, regulated fields) and how you detect/redact. - Tool permissions: explicit allowlist; default deny for any state-changing tool. - Environment separation: dev/stage/prod keys and logs must not mix. 5) Implement evaluation as a release gate - Golden set: collect real inputs (sanitized) and expected outputs/properties. - Adversarial set: prompt injection attempts, policy violations, edge cases. - CI integration: one command that runs evals and fails the build on regressions. - Artifacts: store failing cases and traces so reviewers can inspect changes in PRs. 6) Audit & incident readiness - Decide what you retain (inputs, retrieved docs, tool calls, outputs), and what must be redacted. - Make it possible to answer: which prompt/workflow version + model produced a given output. - Define an escalation path: who gets paged when a policy violation or severe hallucination is detected. 7) Ship the smallest “controls UI” that matters - A page or internal doc that shows: routing policy, current model, eval status, and where logs live. - A runbook link: steps to disable the workflow or switch models quickly. Exit criteria (you’re done when all are true) - You can reproduce a single output: inputs (redacted), retrieval context, prompt/workflow version, model, tool calls. - You can block a class of requests via policy without code changes. - You can switch models (or fail over) via config and see the change in logs. - Evals run in CI and prevent shipping a known-bad regression. Use this checklist on one workflow first. Then repeat. Control planes are built by accumulation, not by big-bang rewrites.