Audit-Grade LLM Feature Launch Checklist (2026)

Use this before you ship an LLM feature that touches customer data or internal systems.

1) Define the system boundary
- Write a one-paragraph spec: what the model is allowed to do, and what it must never do.
- Decide the default mode: read-only, or allowed to mutate data.
- List the connected systems (Slack, Google Drive, Confluence, GitHub, CRM, billing, ticketing).

2) Retrieval you can trust
- Identify sources of truth and their owners (who is on the hook when docs are wrong?).
- Implement ingestion as a pipeline (idempotent runs, backfills, alerts).
- Enforce authorization at query time (document-level ACLs); don’t rely on “it’s in the vector DB.”
- Require citations for high-stakes answers (show doc IDs/links).
- Add deletion support (user requests, retention rules, offboarding).

3) Tool calling as an API platform
- Define strict schemas for every tool input; validate server-side.
- Separate read tools from write tools; gate write tools behind flags/approvals.
- Make tool calls idempotent where possible; add rate limits and retries.
- Log tool args and results with correlation IDs (treat the logs as sensitive data).

4) Observability and replay
- Log: model/provider/version, temperature/top_p, system prompt, user prompt, retrieved doc IDs, tool calls.
- Ensure you can replay a production incident in a safe environment (redacted data, same prompts, same model).
- Add dashboards for: tool error rate, retrieval empty-hit rate, refusal/safety-trigger rate.

5) Evals before customization
- Build a golden set from real tasks (support tickets, internal Q&A, workflows).
- Add regression tests for your worst failures (prompt injection attempts, permission edge cases, outdated policy questions).
- Track eval results by model version and prompt version.

6) Security review essentials
- Threat-model prompt injection and data exfiltration for every connector.
- Redact secrets in logs; define retention and access controls for traces.
- Confirm vendor terms and data handling for your LLM provider.

Launch gate
- You can answer: “Why did it say this?” with citations and traces.
- You can answer: “What data could it access?” with a permission map.
- You can roll back: model version, prompt version, retrieval index version.
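
Example sketch for items 2 and 3
The checklist asks for authorization at query time and for write tools that are gated, validated server-side, and idempotent. The following is a minimal Python sketch of those two ideas, not a reference implementation. Every name in it (Doc, authorized_hits, WriteToolGate, the refund tool and its ticket_id and amount_cents fields) is hypothetical and chosen only for illustration; a real system would likely use its own ACL model, a schema library, and durable storage for idempotency keys.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Doc:
    """A retrieved document with the ACL captured at ingestion (hypothetical shape)."""
    doc_id: str
    text: str
    allowed_groups: frozenset


def authorized_hits(hits, user_groups):
    """Drop retrieved docs the caller may not read; runs on every query."""
    groups = set(user_groups)
    return [d for d in hits if d.allowed_groups & groups]


@dataclass
class WriteToolGate:
    """Gate for a hypothetical 'refund' write tool: flag, schema check, idempotency."""
    enabled: bool = False  # feature flag; the default mode is read-only
    _seen: dict = field(default_factory=dict)  # in-memory only; assumption for the sketch

    def refund(self, args, idempotency_key):
        if not self.enabled:
            raise PermissionError("write tools are disabled")
        # Server-side validation: exact field set and types, nothing extra.
        if set(args) != {"ticket_id", "amount_cents"}:
            raise ValueError("unexpected or missing fields")
        if not isinstance(args["ticket_id"], str) or not args["ticket_id"]:
            raise ValueError("ticket_id must be a non-empty string")
        amount = args["amount_cents"]
        if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
            raise ValueError("amount_cents must be a positive integer")
        # Idempotency: a retried call with the same key returns the first result.
        if idempotency_key not in self._seen:
            self._seen[idempotency_key] = {"status": "refunded", **args}
        return self._seen[idempotency_key]
```

Usage note: call authorized_hits on the vector-search results before they reach the prompt, using the groups of the requesting user rather than those of the service account. The gate starts disabled so that enabling writes is an explicit decision, and the model-supplied arguments are rejected unless they match the expected fields exactly.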