MCP CONTROL PLANE READINESS CHECKLIST (OPS + SECURITY)

Use this checklist for any deployment where an assistant can call tools that touch company systems (SaaS apps, databases, cloud, CI/CD, ticketing, email, etc.). The goal is to make every tool call attributable to an identity, least-privilege, auditable, and hard to misuse by chaining individually harmless calls into a harmful sequence.

1) Identity and Authentication
- Define the effective identity for every tool call (human user, service role, environment, client).
- Avoid shared “god” credentials for interactive use. Prefer per-user OAuth where the target system supports it.
- Use short-lived tokens where possible; document rotation and revocation steps.
- Decide how local clients authenticate to remote MCP services (mutual TLS, signed tokens, SSO).
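
A minimal sketch of the identity items above, assuming every tool call carries an explicit identity record that is checked before anything runs. The CallIdentity fields, the check_identity helper, and the one-hour lifetime cap are illustrative assumptions, not part of any MCP SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed policy value: reject tokens that live longer than this.
MAX_TOKEN_LIFETIME = timedelta(hours=1)


@dataclass(frozen=True)
class CallIdentity:
    """Hypothetical record of who a tool call runs as."""
    user: str            # human on whose behalf the call is made
    service_role: str    # role the tool assumes downstream
    environment: str     # e.g. "dev", "staging", "prod"
    client: str          # which client application issued the call
    token_expires_at: datetime


def check_identity(identity: CallIdentity, now: datetime) -> list[str]:
    """Return a list of problems; an empty list means the identity is usable."""
    problems = []
    for name in ("user", "service_role", "environment", "client"):
        if not getattr(identity, name):
            problems.append(f"missing {name}")
    if identity.token_expires_at <= now:
        problems.append("token expired")
    elif identity.token_expires_at - now > MAX_TOKEN_LIFETIME:
        problems.append("token lifetime too long")
    return problems


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    ident = CallIdentity("alice", "support-agent", "prod", "desktop",
                         now + timedelta(minutes=15))
    print(check_identity(ident, now))
```

Returning a list of problems rather than a boolean keeps the rejection reason available for the audit log.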

2) Tool Catalog and Scopes
- Create a tool inventory with owners (a named team/person) and a purpose statement per tool.
- Split tools by risk: read-only vs state-changing vs irreversible.
- Constrain arguments: allowlists for resources (projects, repos, accounts), validated formats, max result sizes.
- Explicitly disable “raw export” style tools unless there’s a documented business need.
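
One way to express the catalog items above is a small in-code inventory with a risk tier, an owner, and argument constraints per tool. This is a sketch under assumptions: the tool names, owners, resource names, and the validate_call helper are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    READ_ONLY = "read_only"
    STATE_CHANGING = "state_changing"
    IRREVERSIBLE = "irreversible"


@dataclass(frozen=True)
class ToolSpec:
    name: str
    owner: str                     # named team or person
    purpose: str                   # one-line purpose statement
    risk: Risk
    allowed_resources: frozenset   # allowlist of projects, repos, accounts
    max_results: int = 100         # cap on result size
    enabled: bool = True


# Hypothetical inventory; the raw export tool is disabled by default.
CATALOG = {
    "list_tickets": ToolSpec(
        "list_tickets", "support-tools team", "Read open tickets",
        Risk.READ_ONLY, frozenset({"project-a", "project-b"})),
    "export_all_customers": ToolSpec(
        "export_all_customers", "data-platform team", "Raw customer export",
        Risk.READ_ONLY, frozenset({"crm"}), enabled=False),
}


def validate_call(tool_name: str, resource: str, limit: int) -> ToolSpec:
    """Reject calls to unknown or disabled tools and out-of-bounds arguments."""
    spec = CATALOG.get(tool_name)
    if spec is None:
        raise PermissionError(f"unknown tool: {tool_name}")
    if not spec.enabled:
        raise PermissionError(f"tool disabled: {tool_name}")
    if resource not in spec.allowed_resources:
        raise PermissionError(f"resource not allowlisted: {resource}")
    if not 1 <= limit <= spec.max_results:
        raise ValueError(f"limit must be between 1 and {spec.max_results}")
    return spec
```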

3) Policy and Approvals
- Implement a single enforcement point before tool execution (middleware/gateway pattern).
- Require explicit approvals for high-impact actions (deploy, delete, payments, bulk email, permission changes).
- Define which roles can approve and how approvals are recorded.
- Make environment part of policy (dev vs staging vs prod).
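
The policy items above can be sketched as a single decision function that every tool call passes through before execution. The action names, approver roles, and the two rules marked as assumptions (dev is exempt from approval, requesters cannot approve their own actions) are illustrative choices, not requirements from this checklist.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the high-impact categories named in the checklist.
HIGH_IMPACT = {"deploy", "delete", "payment", "bulk_email", "permission_change"}
# Hypothetical roles that are allowed to approve.
APPROVER_ROLES = {"release-manager", "security-lead"}
ENVIRONMENTS = {"dev", "staging", "prod"}


@dataclass(frozen=True)
class Approval:
    approver: str
    approver_role: str
    record_id: str  # where the approval is recorded, e.g. a ticket reference


def decide(action: str, environment: str, requester: str,
           approval: Optional[Approval] = None) -> str:
    """Return "allow", "deny", or "needs_approval" for one tool call."""
    if environment not in ENVIRONMENTS:
        return "deny"
    if action not in HIGH_IMPACT:
        return "allow"
    if environment == "dev":
        return "allow"  # assumption: dev does not require approvals
    if approval is None:
        return "needs_approval"
    if approval.approver_role not in APPROVER_ROLES:
        return "deny"
    if approval.approver == requester:
        return "deny"  # assumption: no self-approval
    return "allow"
```

Keeping the decision in one pure function makes it easy to unit-test and to place in a gateway in front of every tool.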

4) Logging, Audit, and Forensics
- Log: user identity context, original user request, tool selected, arguments (sanitized), downstream response metadata.
- Store logs in an immutable or tamper-evident system with a retention policy.
- Ensure you can reconstruct the full chain of tool calls for one user request, for example by tagging every call with a shared request ID.
- Create an operator playbook: how to investigate suspicious tool calls.
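
A minimal sketch of a tamper-evident log, assuming an in-memory list and a SHA-256 hash chain in which each entry commits to the previous one. The field names and helper functions are hypothetical; a real deployment would write to durable, access-controlled storage. Arguments are assumed to be sanitized before they reach append_entry.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON so the same body always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: list, request_id: str, user: str, tool: str, args: dict) -> dict:
    """Append one audit entry linked to the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"request_id": request_id, "user": user, "tool": tool,
            "args": args, "prev": prev}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Return False if any entry was altered, removed, or reordered."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


def chain_for_request(log: list, request_id: str) -> list:
    """Reconstruct the ordered tool chain for one user request."""
    return [e["tool"] for e in log if e["request_id"] == request_id]
```

The shared request_id is what lets an investigator pull every tool call triggered by one user request, in order.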

5) Data Handling and Egress Controls
- Define what sensitive data must never be returned in tool responses (secrets, access tokens, certain PII).
- Redact sensitive fields before returning results to the model/client.
- Put size limits on tool outputs to prevent accidental bulk leakage.
- Document what data can be stored in conversation history and what must be excluded.
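
The redaction and size-limit items above might look like the following sketch, which assumes tool results are JSON-like structures (dicts, lists, scalars). The key names in SENSITIVE_KEYS and the byte limit are assumptions to be replaced by your own data classification.

```python
import json

# Assumed deny-list of field names; matching is case-insensitive.
SENSITIVE_KEYS = {"password", "secret", "access_token", "api_key", "ssn"}
# Assumed cap on the serialized size of one tool response.
MAX_OUTPUT_BYTES = 2000
REDACTED = "[REDACTED]"


def redact(value):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            k: REDACTED if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def prepare_response(result) -> str:
    """Redact first, then enforce the size limit on the serialized output."""
    encoded = json.dumps(redact(result))
    if len(encoded.encode("utf-8")) > MAX_OUTPUT_BYTES:
        raise ValueError("tool output exceeds size limit")
    return encoded
```

Key-based redaction only catches fields you have named; secrets embedded in free text need separate pattern-based scanning.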

6) Reliability and Kill Switches
- Implement timeouts, retries, and circuit breakers for downstream systems. Retry only calls that are safe to repeat, so a retry cannot duplicate a state change.
- Add a “disable tool” switch (per tool, per environment) for incident response.
- Rate-limit tool calls per user/client to reduce abuse and runaway loops.
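
A sketch of the kill switch and rate limit items, assuming a single in-process gate with a sliding-window limit per user. The ToolGate class and its method names are hypothetical; a multi-instance deployment would need shared state, such as a database or cache, instead of in-memory structures.

```python
from collections import defaultdict, deque


class ToolGate:
    """Per-tool, per-environment kill switch plus a per-user rate limit."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self.disabled = set()             # set of (tool, environment) pairs
        self.calls = defaultdict(deque)   # user -> timestamps of recent calls

    def disable(self, tool: str, environment: str) -> None:
        self.disabled.add((tool, environment))

    def enable(self, tool: str, environment: str) -> None:
        self.disabled.discard((tool, environment))

    def allow(self, user: str, tool: str, environment: str, now: float) -> bool:
        """Return True if the call may proceed at time `now` (in seconds)."""
        if (tool, environment) in self.disabled:
            return False
        recent = self.calls[user]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True
```

Passing `now` explicitly keeps the gate deterministic in tests; production code would typically pass `time.monotonic()`.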

7) Ownership and Change Control
- Assign an on-call owner for the MCP boundary and each critical tool.
- Require code review for new tools and any permission/policy changes.
- Maintain a changelog of tool additions, removals, and scope changes.

If you can’t pass sections 1–4, don’t ship state-changing tools. Keep the assistant read-only until identity, policy, and audit controls are implemented and verified.