MCP TOOL SERVER LAUNCH CHECKLIST (V1) Goal: ship a small MCP tool surface that agents can use safely and reliably. This is not a “chat feature.” Treat it like an external interface with security, audit, and failure handling. 1) Choose the first tools - Pick 3–5 tools tied to high-frequency user intent (the stuff users do every day). - Prefer actions that are either read-only or easily reversible. - Write a one-line contract for each tool: verb + object + constraints (e.g., “create_support_ticket with required customer_id and category enum”). 2) Define strict schemas - Use explicit JSON schemas for every tool input and output. - Make risky parameters enums (reason codes, action types) instead of free text. - Define error shapes (invalid_input, not_authorized, not_found, conflict, rate_limited) so clients can react deterministically. 3) Authorization and scopes - Map scopes to business actions (refund, publish, rotate_key), not generic read/write. - Default to least privilege: per-tool, per-workspace, and ideally per-resource scopes. - Decide how user identity is passed into tool calls and how you bind it to your existing RBAC. 4) Idempotency and retries - Assume retries happen. Require a request_id for every mutating tool. - Make create actions idempotent (create_or_get patterns where possible). - Protect against duplicate side effects (double refunds, duplicate emails, duplicate provisioning). 5) Human approval path - For irreversible or high-risk actions, require an approval tool step. - Define who can approve (role) and what gets shown (diff, context, reason). - Log the approval event separately from the execution event. 6) Logging and audit - Persist tool name, actor identity, parameters (with redaction rules), result, timestamp. - Keep an audit view an admin can export (even if it’s basic at first). - Add correlation IDs that tie tool calls to backend events. 7) Safety and data handling - Decide what data can be returned to the model. Redact secrets by default. - Add server-side limits: rate limits per user/workspace; payload size limits. - Build a “deny by policy” response that is clear and machine-readable. 8) Rollout plan - Start with a closed beta: internal users, then design partners. - Ship read-only tools first if your system-of-record risk is high. - Add a kill switch: disable a tool or an entire workspace quickly. 9) Test strategy - Unit test input validation and policy decisions. - Integration test idempotency under retry storms. - Create a small suite of “agent abuse” tests (invalid params, missing fields, repeated calls). Definition of done for V1: 3–5 tools with strict schemas, enforced auth, idempotency for writes, approval gating where needed, and auditable logs—running in production with a kill switch.