MCP TOOL SERVER LAUNCH CHECKLIST (V1)

Goal: ship a small MCP tool surface that agents can use safely and reliably. This is not a “chat feature.” Treat it like an external interface with security, audit, and failure handling.

1) Choose the first tools
- Pick 3–5 tools tied to high-frequency user intent (tasks users perform every day).
- Prefer actions that are either read-only or easily reversible.
- Write a one-line contract for each tool: verb + object + constraints (e.g., “create_support_ticket with required customer_id and category enum”).

2) Define strict schemas
- Use explicit JSON schemas for every tool input and output.
- Make risky parameters enums (reason codes, action types) instead of free text.
- Define error shapes (invalid_input, not_authorized, not_found, conflict, rate_limited) so clients can react deterministically.
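
A minimal sketch of the schema and error-shape items above, using the create_support_ticket contract from section 1. The category values, the error envelope layout, and the helper names (error, validate_create_ticket) are assumptions for illustration. The validator is hand-rolled to stay dependency-free; a real server would more likely hand the schema to a JSON Schema validation library.

```python
from enum import Enum
from typing import Optional


class Category(str, Enum):
    # Hypothetical category values; substitute your own.
    BILLING = "billing"
    TECHNICAL = "technical"
    ACCOUNT = "account"


# JSON Schema for the tool input, written as a plain dict.
CREATE_TICKET_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "customer_id": {"type": "string", "minLength": 1},
        "category": {"type": "string", "enum": [c.value for c in Category]},
    },
    "required": ["customer_id", "category"],
    "additionalProperties": False,
}

# The fixed set of error codes named in the checklist.
ERROR_CODES = ("invalid_input", "not_authorized", "not_found", "conflict", "rate_limited")


def error(code: str, message: str, **details) -> dict:
    """Build an error in one fixed shape so clients can branch on `code`."""
    if code not in ERROR_CODES:
        raise ValueError(f"unknown error code: {code}")
    return {"ok": False, "error": {"code": code, "message": message, "details": details}}


def validate_create_ticket(params: dict) -> Optional[dict]:
    """Check params against the schema above; return an error dict or None."""
    props = CREATE_TICKET_INPUT_SCHEMA["properties"]
    unknown = sorted(set(params) - set(props))
    if unknown:
        return error("invalid_input", "unknown fields", fields=unknown)
    missing = [k for k in CREATE_TICKET_INPUT_SCHEMA["required"] if k not in params]
    if missing:
        return error("invalid_input", "missing required fields", fields=missing)
    if not isinstance(params["customer_id"], str) or not params["customer_id"]:
        return error("invalid_input", "customer_id must be a non-empty string", fields=["customer_id"])
    if params["category"] not in props["category"]["enum"]:
        return error("invalid_input", "category is not an allowed value", fields=["category"])
    return None
```

Because category is an enum, a free-text value such as "other" is rejected with invalid_input rather than passed through to the backend.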

3) Authorization and scopes
- Map scopes to business actions (refund, publish, rotate_key), not generic read/write.
- Default to least privilege: per-tool, per-workspace, and ideally per-resource scopes.
- Decide how user identity is passed into tool calls and how you bind it to your existing RBAC.
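
One way the scope model above could look, as a sketch. The tool names, the Grant type, and the idea that grants are resolved from your existing RBAC before the check are all assumptions; only the scope names (refund, publish, rotate_key) come from the checklist.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical mapping from tool name to the business-action scope it requires.
TOOL_SCOPES = {
    "issue_refund": "refund",
    "publish_article": "publish",
    "rotate_api_key": "rotate_key",
}


@dataclass(frozen=True)
class Grant:
    """A scope granted to an actor, assumed to be resolved from existing RBAC."""
    scope: str
    workspace_id: str
    resource_id: Optional[str] = None  # None means any resource in the workspace


def is_authorized(grants: Iterable[Grant], tool: str, workspace_id: str, resource_id: str) -> bool:
    """Return True only if some grant covers this tool, workspace, and resource."""
    required = TOOL_SCOPES.get(tool)
    if required is None:
        return False  # least privilege: tools without a mapped scope are denied
    for grant in grants:
        if grant.scope != required or grant.workspace_id != workspace_id:
            continue
        if grant.resource_id is None or grant.resource_id == resource_id:
            return True
    return False
```

A grant of Grant("refund", "ws_1", "order_9") authorizes issue_refund on that one order only; it does not authorize other orders, other workspaces, or other tools.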

4) Idempotency and retries
- Assume retries happen. Require a caller-supplied request_id on every call to a mutating tool, and use it as the idempotency key.
- Make create actions idempotent (create_or_get patterns where possible).
- Protect against duplicate side effects (double refunds, duplicate emails, duplicate provisioning).
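
A minimal in-memory sketch of request_id deduplication, assuming a single process. IdempotencyStore, issue_refund, and the result shape are hypothetical names for illustration; a production server would typically back this with a durable store and a uniqueness constraint on request_id.

```python
import threading
from typing import Callable


class IdempotencyStore:
    """Remembers the result of each request_id so a retry replays it."""

    def __init__(self) -> None:
        self._results: dict = {}
        self._lock = threading.Lock()

    def run_once(self, request_id: str, action: Callable[[], dict]) -> dict:
        # The lock is held while the action runs. That is coarse, but it keeps
        # two concurrent retries from both executing the side effect.
        with self._lock:
            if request_id in self._results:
                return self._results[request_id]
            result = action()
            self._results[request_id] = result
            return result


# Stand-in for the real side effect (a payment provider call, for example).
refunds_issued: list = []


def issue_refund(store: IdempotencyStore, request_id: str, order_id: str, amount_cents: int) -> dict:
    def _do() -> dict:
        refunds_issued.append((order_id, amount_cents))
        return {"ok": True, "refund_id": f"rf_{len(refunds_issued)}"}

    return store.run_once(request_id, _do)
```

Calling issue_refund twice with the same request_id records one refund and returns the same result both times. If the action raises, nothing is stored, so a later retry runs it again.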

5) Human approval path
- For irreversible or high-risk actions, require an explicit approval step (a separate approval tool call) before execution.
- Define who can approve (role) and what gets shown (diff, context, reason).
- Log the approval event separately from the execution event.
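
The approval flow above could be sketched roughly as follows. The role name, the PendingAction fields, and the event names are assumptions; the point is that approval and execution are separate steps and produce separate log events.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional

APPROVER_ROLES = {"finance_admin"}  # hypothetical role allowed to approve
audit_log: list = []


def _log(event: str, approval_id: str, actor: str) -> None:
    audit_log.append({
        "event": event,
        "approval_id": approval_id,
        "actor": actor,
        "at": datetime.now(timezone.utc).isoformat(),
    })


@dataclass
class PendingAction:
    tool: str
    params: dict          # what the approver is shown
    requested_by: str
    reason: str
    approval_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    approved_by: Optional[str] = None
    executed: bool = False


def request_approval(tool: str, params: dict, requested_by: str, reason: str) -> PendingAction:
    pending = PendingAction(tool, params, requested_by, reason)
    _log("approval_requested", pending.approval_id, requested_by)
    return pending


def approve(pending: PendingAction, approver: str, approver_role: str) -> None:
    if approver_role not in APPROVER_ROLES:
        raise PermissionError(f"role {approver_role!r} may not approve")
    pending.approved_by = approver
    _log("approved", pending.approval_id, approver)


def execute(pending: PendingAction, run: Callable[[dict], object]) -> object:
    if pending.approved_by is None:
        raise PermissionError("action has not been approved")
    if pending.executed:
        raise RuntimeError("action already executed")
    result = run(pending.params)
    pending.executed = True
    _log("executed", pending.approval_id, pending.requested_by)  # separate from "approved"
    return result
```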

6) Logging and audit
- Persist tool name, actor identity, parameters (with redaction rules), result, timestamp.
- Keep an audit view an admin can export (even if it’s basic at first).
- Add correlation IDs that tie tool calls to backend events.
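
A sketch of an audit record carrying the fields listed above. The redacted key names and the record layout are assumptions, and the redaction here is shallow (top-level keys only); nested parameters would need a recursive rule.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional

# Hypothetical parameter names that must never be persisted in clear text.
REDACTED_KEYS = {"api_key", "password", "card_number"}


def redact(params: dict) -> dict:
    """Replace the values of sensitive top-level keys with a marker."""
    return {k: ("[REDACTED]" if k in REDACTED_KEYS else v) for k, v in params.items()}


def audit_record(tool: str, actor: str, params: dict, result: dict,
                 correlation_id: Optional[str] = None) -> dict:
    """Build one audit entry; reuse the caller's correlation_id when given."""
    return {
        "tool": tool,
        "actor": actor,
        "params": redact(params),
        "result": result,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "correlation_id": correlation_id or uuid.uuid4().hex,
    }


# Usage: one JSON line per tool call, easy to export later.
line = json.dumps(audit_record(
    "rotate_api_key", "user:42", {"key_id": "k1", "api_key": "s3cret"}, {"ok": True},
    correlation_id="corr-1",
))
```

Passing the same correlation_id to backend services is what lets an admin trace a tool call through to the events it caused.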

7) Safety and data handling
- Decide what data can be returned to the model. Redact secrets by default.
- Add server-side limits: rate limits per user/workspace; payload size limits.
- Build a “deny by policy” response that is clear and machine-readable.
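
A rough sketch of the server-side limits and the machine-readable denial. The limit values, the denied_by_policy code, and the policy names are assumptions, and the fixed-window limiter is just one simple choice; a token bucket or a shared store would be reasonable alternatives.

```python
import json
import time
from typing import Callable, Optional

MAX_PAYLOAD_BYTES = 16 * 1024  # hypothetical limit


def deny(policy: str, message: str) -> dict:
    """Machine-readable policy denial: clients branch on code and policy."""
    return {"ok": False, "error": {"code": "denied_by_policy", "policy": policy, "message": message}}


class FixedWindowLimiter:
    """Allow at most `limit` calls per key in each `window` seconds."""

    def __init__(self, limit: int, window: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._state: dict = {}  # key -> (window_start, count)

    def allow(self, key) -> bool:
        now = self._clock()
        start, count = self._state.get(key, (now, 0))
        if now - start >= self._window:
            start, count = now, 0  # a new window begins
        if count >= self._limit:
            self._state[key] = (start, count)
            return False
        self._state[key] = (start, count + 1)
        return True


def check_limits(limiter: FixedWindowLimiter, user_id: str, workspace_id: str, params: dict) -> Optional[dict]:
    """Return an error dict if a limit is hit, else None."""
    size = len(json.dumps(params).encode("utf-8"))
    if size > MAX_PAYLOAD_BYTES:
        return deny("payload_too_large", f"payload is {size} bytes; limit is {MAX_PAYLOAD_BYTES}")
    if not limiter.allow((workspace_id, user_id)):
        return {"ok": False, "error": {"code": "rate_limited", "message": "too many calls; retry later"}}
    return None
```

The payload check runs before the rate limiter, so an oversized request is rejected without consuming the caller's quota.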

8) Rollout plan
- Start with a closed beta: internal users, then design partners.
- Ship read-only tools first if your system-of-record risk is high.
- Add a kill switch: disable a tool or an entire workspace quickly.
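
The kill switch can be as small as two deny sets checked before every tool call. This is an in-memory sketch with hypothetical names; in practice the sets would live in a feature-flag service or config store so that a change takes effect across all server instances.

```python
class KillSwitch:
    """Tracks disabled tools and workspaces; check before dispatching a call."""

    def __init__(self) -> None:
        self._disabled_tools: set = set()
        self._disabled_workspaces: set = set()

    def disable_tool(self, tool: str) -> None:
        self._disabled_tools.add(tool)

    def enable_tool(self, tool: str) -> None:
        self._disabled_tools.discard(tool)

    def disable_workspace(self, workspace_id: str) -> None:
        self._disabled_workspaces.add(workspace_id)

    def enable_workspace(self, workspace_id: str) -> None:
        self._disabled_workspaces.discard(workspace_id)

    def is_allowed(self, tool: str, workspace_id: str) -> bool:
        return tool not in self._disabled_tools and workspace_id not in self._disabled_workspaces
```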

9) Test strategy
- Unit test input validation and policy decisions.
- Integration test idempotency under retry storms.
- Create a small suite of “agent abuse” tests (invalid params, missing fields, repeated calls).
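
A small "agent abuse" suite might start like this. The create_ticket handler is a hypothetical stand-in defined only so the tests have a target; point the same cases at your real handlers.

```python
import unittest

_created: dict = {}  # request_id -> ticket_id, standing in for the system of record


def create_ticket(params) -> dict:
    """Hypothetical tool handler: validates input and dedupes on request_id."""
    if not isinstance(params, dict):
        return {"ok": False, "error": {"code": "invalid_input", "field": None}}
    for key in ("request_id", "customer_id"):
        if not isinstance(params.get(key), str) or not params[key]:
            return {"ok": False, "error": {"code": "invalid_input", "field": key}}
    request_id = params["request_id"]
    if request_id not in _created:
        _created[request_id] = f"t_{len(_created) + 1}"
    return {"ok": True, "ticket_id": _created[request_id]}


class AgentAbuseTests(unittest.TestCase):
    def setUp(self) -> None:
        _created.clear()

    def test_missing_field(self) -> None:
        result = create_ticket({"request_id": "r1"})
        self.assertEqual(result["error"], {"code": "invalid_input", "field": "customer_id"})

    def test_wrong_type(self) -> None:
        result = create_ticket({"request_id": "r1", "customer_id": 123})
        self.assertEqual(result["error"]["code"], "invalid_input")

    def test_non_object_params(self) -> None:
        self.assertFalse(create_ticket("not a dict")["ok"])

    def test_repeated_calls_create_one_ticket(self) -> None:
        calls = [create_ticket({"request_id": "r1", "customer_id": "c1"}) for _ in range(50)]
        self.assertEqual({c["ticket_id"] for c in calls}, {"t_1"})
        self.assertEqual(len(_created), 1)
```

Saved as a module, the suite runs with python -m unittest. The repeated-call case is sequential; a retry-storm integration test would issue the same calls concurrently against the real backend.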

Definition of done for V1: 3–5 tools with strict schemas, enforced auth, idempotency for writes, approval gating where needed, and auditable logs, all running in production with a kill switch.