MCP SERVER LAUNCH CHECKLIST (PRODUCTION-GRADE)

Goal: ship an MCP server that exposes useful capabilities while keeping permissions tight, behavior predictable, and audits easy.

1) Define the server’s job (one sentence)
- What capability does this server provide (e.g., “read incident history from PagerDuty”)?
- What is explicitly out of scope (writes, admin operations, bulk export)? Write it down.

2) Inventory tools and classify risk
For each tool:
- Action type: READ / WRITE / DESTRUCTIVE
- Target systems (GitHub, Jira, Snowflake, Slack, internal services)
- Side effects (creates ticket, posts message, modifies config)
If you can’t describe side effects clearly, the tool contract is not ready.
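
As an illustration, the inventory can live in code so that incomplete entries are caught mechanically. This is a minimal sketch, not a prescribed format: the names `ToolEntry`, `ActionType`, and `contract_problems`, and the sample tools, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ActionType(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class ToolEntry:
    """One row of the tool inventory (hypothetical shape)."""
    name: str
    action: ActionType
    targets: tuple[str, ...]
    side_effects: tuple[str, ...] = ()


def contract_problems(tool: ToolEntry) -> list[str]:
    """Return the reasons this inventory entry is not ready for review."""
    problems = []
    if not tool.targets:
        problems.append("no target systems listed")
    # A non-read tool with no described side effects fails the rule above.
    if tool.action is not ActionType.READ and not tool.side_effects:
        problems.append("side effects not described")
    return problems


# Sample entries, invented for illustration only.
inventory = [
    ToolEntry("get_incident_history", ActionType.READ, ("PagerDuty",)),
    ToolEntry("create_ticket", ActionType.WRITE, ("Jira",), ("creates ticket",)),
    ToolEntry("update_config", ActionType.WRITE, ("internal services",)),
]

for tool in inventory:
    print(tool.name, contract_problems(tool))
```

Running a check like this in CI is one way to keep the inventory from drifting away from the tools that are actually registered.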

3) Scoping and authorization
- Deny-by-default for any write tool.
- Use minimal scopes per tool (separate tokens/roles where practical).
- Add explicit allowlists for high-risk resources (projects, repos, queues, datasets).
- Decide who can invoke write tools (all engineers, on-call only, specific groups).
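
The rules above might be sketched as a single authorization check. Everything here is an assumption made for illustration: the policy table, the group and project names, and the choice to leave read tools ungated (a real server may want resource allowlists on reads too).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WritePolicy:
    allowed_groups: frozenset[str]
    allowed_resources: frozenset[str]


# Hypothetical policy table: a write tool is callable only if it is listed here.
WRITE_POLICIES = {
    "create_ticket": WritePolicy(
        allowed_groups=frozenset({"on-call"}),
        allowed_resources=frozenset({"OPS", "INFRA"}),
    ),
}

# Hypothetical set of read-only tools; reads are not gated in this sketch.
READ_TOOLS = {"get_incident_history"}


def is_allowed(tool: str, caller_groups: set[str], resource: str) -> bool:
    """Deny by default: anything not explicitly permitted is refused."""
    if tool in READ_TOOLS:
        return True
    policy = WRITE_POLICIES.get(tool)
    if policy is None:
        return False  # unknown or unlisted tool
    if not (policy.allowed_groups & caller_groups):
        return False  # caller is in none of the permitted groups
    return resource in policy.allowed_resources  # explicit resource allowlist


print(is_allowed("create_ticket", {"on-call"}, "OPS"))
print(is_allowed("create_ticket", {"engineers"}, "OPS"))
print(is_allowed("delete_repo", {"on-call"}, "OPS"))
```

The point of the shape is that adding a new write tool requires adding a policy entry; forgetting to do so fails closed rather than open.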

4) Authentication and secret handling
- Prefer short-lived credentials over long-lived API keys.
- Document rotation: who rotates, how often, and what breaks if rotation fails.
- Store secrets in a managed system (cloud secret manager or equivalent), not in repo configs.

5) Observability (non-negotiable)
Log every tool call with:
- Tool name + version
- Caller identity (user/service) and client identity (which MCP client)
- Resource identifiers (repo, project, ticket ID) — avoid dumping full content by default
- Outcome: success/denied/error + error class
- Correlation ID for tracing
Decide where logs go (central logging/SIEM) and how long they are retained.
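
One possible shape for such a log record, written as JSON lines through Python's standard `logging` module. The field names, the `mcp.audit` logger name, and the helper functions are assumptions for this sketch, not a required schema.

```python
import json
import logging
import uuid

logger = logging.getLogger("mcp.audit")  # hypothetical logger name

VALID_OUTCOMES = {"success", "denied", "error"}


def audit_record(tool, tool_version, caller, client, resource_ids, outcome,
                 error_class=None, correlation_id=None) -> dict:
    """Build one audit record holding identifiers only, never full content."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return {
        "tool": tool,
        "tool_version": tool_version,
        "caller": caller,              # user or service identity
        "client": client,              # which MCP client made the call
        "resource_ids": resource_ids,  # e.g. {"project": "OPS"}
        "outcome": outcome,
        "error_class": error_class,
        # Reuse the caller's correlation ID if given, otherwise mint one.
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }


def log_tool_call(**fields) -> dict:
    """Emit the record as a single JSON line and return it."""
    record = audit_record(**fields)
    logger.info(json.dumps(record, sort_keys=True))
    return record


logging.basicConfig(level=logging.INFO, format="%(message)s")
log_tool_call(tool="create_ticket", tool_version="1.2.0", caller="alice",
              client="ide-plugin", resource_ids={"project": "OPS"},
              outcome="denied", error_class="not_in_allowlist")
```

Emitting one JSON object per call keeps the records easy to ship to a central log store and to query by any of the fields.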

6) Safety controls
- Rate limits per caller and per tool.
- Confirmation gates for writes (explicit confirmation flag or two-step flow).
- Kill switch: the ability to disable the whole server or specific tools quickly.
- Timeouts and bounded execution (no infinite retries).
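
A minimal in-memory sketch of the first three controls (kill switch, confirmation gate, sliding-window rate limit). The `ToolGate` class and its return strings are hypothetical, and a real deployment would likely keep this state in a shared store rather than in process memory. Timeouts and retry caps are not shown.

```python
import time
from collections import defaultdict, deque


class ToolGate:
    """Checks that run before every tool call (illustrative only)."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.server_disabled = False           # kill switch for the whole server
        self.disabled_tools: set[str] = set()  # kill switch per tool
        self._calls = defaultdict(deque)       # (caller, tool) -> call timestamps

    def check(self, caller: str, tool: str, is_write: bool,
              confirmed: bool = False) -> str:
        if self.server_disabled or tool in self.disabled_tools:
            return "denied: disabled"
        if is_write and not confirmed:
            return "denied: confirmation required"
        now = self.clock()
        calls = self._calls[(caller, tool)]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return "denied: rate limited"
        calls.append(now)
        return "ok"


gate = ToolGate(max_calls=2, window_seconds=60)
print(gate.check("alice", "create_ticket", is_write=True))
print(gate.check("alice", "create_ticket", is_write=True, confirmed=True))
gate.disabled_tools.add("create_ticket")
print(gate.check("alice", "create_ticket", is_write=True, confirmed=True))
```

The kill switch is checked first on purpose: once a tool is disabled, no confirmation flag or remaining rate budget should let a call through.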

7) Tool contract quality
- Keep tools small and composable (avoid “doEverything” endpoints).
- Use structured inputs/outputs where possible (schemas).
- Version tool contracts; document breaking changes.
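- Specify what happens on partial failure (roll back, stop at the first error, or continue and report per-item results) so that behavior is deterministic.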
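
To make the last two points concrete, here is a sketch of a small batch tool with typed input and output, a contract version in every response, and one fixed partial-failure rule: process items in input order, continue past errors, and report each item's result. The tool, its types, and the `apply_label` callback are all hypothetical.

```python
from dataclasses import dataclass, field

CONTRACT_VERSION = "1.0.0"  # bump on any breaking change to the types below


@dataclass(frozen=True)
class LabelRequest:
    ticket_ids: tuple[str, ...]
    label: str


@dataclass
class LabelResult:
    contract_version: str
    succeeded: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)  # ticket ID -> error class


def add_label(request: LabelRequest, apply_label) -> LabelResult:
    """Apply a label to each ticket; never abort the batch on one failure."""
    result = LabelResult(contract_version=CONTRACT_VERSION)
    for ticket_id in request.ticket_ids:
        try:
            apply_label(ticket_id, request.label)
        except Exception as exc:
            # Record only the error class, not the message, which may carry content.
            result.failed[ticket_id] = type(exc).__name__
        else:
            result.succeeded.append(ticket_id)
    return result


def fake_apply(ticket_id: str, label: str) -> None:
    """Stand-in backend that rejects one ticket."""
    if ticket_id == "OPS-2":
        raise PermissionError("no access")


print(add_label(LabelRequest(("OPS-1", "OPS-2", "OPS-3"), "triaged"), fake_apply))
```

Continue-and-report is only one of the possible rules; what matters is that the contract names the rule and the output makes the per-item outcome visible to the caller.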

8) Data handling rules
- Decide what content may be returned (full documents vs snippets vs metadata).
- Redact sensitive fields where appropriate.
- Don’t log secrets or full sensitive payloads.
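
Redaction can be sketched as a recursive pass applied both to tool responses and to anything headed for the logs. The key list below is an assumption and would need to be extended for real payloads; matching on key names alone will not catch secrets embedded in free text.

```python
# Hypothetical list of field names treated as sensitive (compared case-insensitively).
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "secret"}
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy of value with sensitive dict fields replaced."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged


payload = {"user": "alice", "token": "abc123",
           "nested": [{"Password": "hunter2", "id": 1}]}
print(redact(payload))
```

Because `redact` builds new containers instead of mutating its input, the original payload stays usable by the tool while only the redacted copy is logged or returned.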

9) Rollout gates
- Start with read-only users or a pilot group.
- Add write tools one at a time, and only after read paths are stable and audited.
- Require an owner/on-call for the server before broad enablement.

10) Incident playbook (yes, for an MCP server)
- How to detect misuse (alerts on unusual write volume, denied calls spiking).
- How to shut it down (kill switch procedure).
- How to audit what happened (query by correlation ID / user / tool).
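
Detection and audit both reduce to queries over the tool-call log. A rough sketch over in-memory records, assuming each record is a dict with `caller`, `tool`, `outcome`, and `correlation_id` keys (hypothetical field names); in practice these would be queries against the central log store.

```python
from collections import Counter


def callers_over_denied_threshold(records, threshold: int) -> list[str]:
    """Callers with at least `threshold` denied calls in the given records."""
    denied = Counter(r["caller"] for r in records if r["outcome"] == "denied")
    return sorted(caller for caller, count in denied.items() if count >= threshold)


def find_records(records, **filters) -> list[dict]:
    """Records matching every given field, e.g. caller=..., tool=..."""
    return [r for r in records
            if all(r.get(key) == value for key, value in filters.items())]


# Sample records, invented for illustration only.
records = [
    {"caller": "alice", "tool": "create_ticket", "outcome": "denied", "correlation_id": "c1"},
    {"caller": "alice", "tool": "create_ticket", "outcome": "denied", "correlation_id": "c2"},
    {"caller": "bob", "tool": "get_incident_history", "outcome": "success", "correlation_id": "c3"},
]

print(callers_over_denied_threshold(records, threshold=2))
print(find_records(records, correlation_id="c3"))
```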

If you can’t pass items 3–6, you don’t have an MCP server yet. You have a demo.