Context & Tool Governance Checklist (2026)

Use this as a pre-flight before you ship (or refactor) an LLM feature. The goal: fewer moving parts, clearer contracts, and fast debugging.

1) Define the job
- Write the single sentence “job to be done” for the model (not the user story).
- List the top 3 failure modes that would be unacceptable (e.g., wrong entitlement, unauthorized disclosure, irreversible action).

2) Choose your source of truth
- For each required fact, mark it as: API/DB field, computed value, or prose document.
- If it’s API/DB: do not use retrieval as the first option. Plan a tool call.
- If it’s prose: decide whether you need citations. If yes, plan stable excerpt IDs.

3) Context assembly (deterministic)
- Specify allowed sources by type (e.g., Jira tickets, GitHub issues, internal policy pages).
- Define an ordering and structure (headers, sections, and a fixed template).
- Include provenance for every included snippet (URL/record ID + timestamp/version).
- Add a hard cap for context size and a rule for what gets dropped first.

4) Retrieval (only if it earns its keep)
- Define what retrieval is allowed to search (and what it must never search).
- Use metadata filters for permissions, but do not rely on them as your only control.
- Treat retrieved text as untrusted input; strip instruction-like patterns where possible.
- Create a small regression set of queries that must keep working as the corpus changes.

5) Tool contracts (strict)
- Every tool has: name, purpose, schema, and explicit permission checks outside the model.
- Validate tool inputs with a schema validator before execution.
- Prefer idempotent tools; for non-idempotent actions, require confirmation.
- Record tool outputs and errors in traces; never hide failures behind a natural-language apology.

6) Guardrails that actually work
- Add refusal criteria tied to missing required fields or insufficient evidence.
- Constrain outputs: structured JSON for actions, templated formats for reports.
- Separate “analysis” from “final answer” in your internal prompting; log both if your policy allows.

7) Tracing & auditability
- Log: assembled context, model config, prompt version, tool calls, tool outputs, and final response.
- Make a single run reconstructible from an ID.
- Define retention and redaction rules (PII, secrets, credentials).

8) Evaluation in CI
- Build a small suite: correctness checks, permission checks, injection tests, and refusal tests.
- Pin prompt/context templates by version; review changes like code.

Exit criteria (ship gate)
- You can explain one wrong answer end-to-end using traces.
- You can prove the model didn’t access unauthorized data.
- You can disable a tool or a data source without breaking the whole feature.
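
Sketch for step 3 (context assembly). The four bullets in step 3 can be illustrated with a small assembler: an allow-list of source types, a fixed ordering, a provenance header on every snippet, and a hard cap with an explicit drop rule. This is a minimal sketch, not a reference implementation. The names `Snippet`, `ALLOWED_ORDER`, and `assemble_context` are hypothetical, the source types are placeholders, and the cap counts characters only for simplicity (a real cap would more likely count tokens).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    source_type: str  # e.g. "policy", "jira", "github" (placeholder types)
    record_id: str    # provenance: URL or record ID
    version: str      # provenance: timestamp or version
    text: str


# Assumed policy: allowed source types in priority order.
# Types later in the list are the first to be dropped when the cap is hit.
ALLOWED_ORDER = ["policy", "jira", "github"]


def assemble_context(snippets, max_chars=2000):
    """Build the context string deterministically from a bag of snippets."""
    # Allow-list by source type; anything else never reaches the model.
    allowed = [s for s in snippets if s.source_type in ALLOWED_ORDER]
    # Sort by (priority, record_id) so input order does not affect the output.
    ordered = sorted(
        allowed,
        key=lambda s: (ALLOWED_ORDER.index(s.source_type), s.record_id),
    )
    blocks, used = [], 0
    for s in ordered:
        # Fixed template: provenance header line, then the snippet text.
        block = f"[{s.source_type}] {s.record_id} @ {s.version}\n{s.text}\n\n"
        if used + len(block) > max_chars:
            # Drop rule: once a snippet does not fit, drop it and
            # everything of lower priority after it.
            break
        blocks.append(block)
        used += len(block)
    return "".join(blocks)
```

Because the output depends only on the set of snippets and the cap, two runs with the same inputs produce the same context string, which makes a traced run easier to reconstruct later.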
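
Sketch for step 5 (tool contracts). The contract in step 5 can be illustrated as a wrapper that runs the permission check, input validation, and confirmation gate in ordinary code before the tool executes, and records every outcome in a trace. This is a sketch under stated assumptions: `Tool`, `validate`, `execute`, and the `issue_refund` example are hypothetical names, and the tiny `validate` function stands in for a real schema validator (it only checks required fields and Python types).

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    purpose: str
    schema: dict[str, type]       # required field -> expected Python type
    required_permission: str
    idempotent: bool
    run: Callable[[dict[str, Any]], Any]


def validate(schema, args):
    """Stand-in for a schema validator: returns a list of error strings."""
    errors = []
    for field, expected in schema.items():
        if field not in args:
            errors.append(f"missing field: {field}")
        elif not isinstance(args[field], expected):
            errors.append(f"wrong type for {field}")
    for field in sorted(set(args) - set(schema)):
        errors.append(f"unexpected field: {field}")
    return errors


def execute(tool, args, user_permissions, confirmed=False, trace=None):
    """Run a tool call; every outcome, including failure, lands in the trace."""
    entry = {"tool": tool.name, "args": dict(args), "ok": False}
    if trace is not None:
        trace.append(entry)
    # Permission check happens here, outside the model.
    if tool.required_permission not in user_permissions:
        entry["error"] = "permission denied"
        return entry
    errors = validate(tool.schema, args)
    if errors:
        entry["error"] = "; ".join(errors)
        return entry
    # Non-idempotent actions need an explicit confirmation flag.
    if not tool.idempotent and not confirmed:
        entry["error"] = "confirmation required"
        return entry
    try:
        entry["output"] = tool.run(args)
        entry["ok"] = True
    except Exception as exc:  # record the failure rather than hiding it
        entry["error"] = f"{type(exc).__name__}: {exc}"
    return entry


# Hypothetical non-idempotent tool, for illustration only.
REFUND = Tool(
    name="issue_refund",
    purpose="Refund an order",
    schema={"order_id": str, "amount_cents": int},
    required_permission="refunds:write",
    idempotent=False,
    run=lambda a: {"refunded": a["amount_cents"]},
)
```

Returning a structured result instead of raising keeps failures visible to the caller and the trace, so the model's reply can be built from the recorded error rather than from a generic apology.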