MCP Gateway Readiness Checklist (Security + Ops)

Use this as a build + review gate before you let any MCP-connected agent perform real actions.

1) Tool Inventory (Know what exists)
- List every MCP server/tool you expose (name, owner, repo, deployment location).
- For each tool: classify actions as READ, WRITE, or DESTRUCTIVE (delete/close/pay/deploy).
- Document required upstream permissions (OAuth scopes/roles) and which resources are in-bounds (specific repos/projects/folders).
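
One way to keep this inventory reviewable is to store it as data and check it against what each server actually exposes. The sketch below is a minimal illustration, not a prescribed schema: the `ToolEntry` fields, the `unclassified` helper, and every example value (tool name, owner, repo, scopes) are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActionClass(Enum):
    READ = "READ"
    WRITE = "WRITE"
    DESTRUCTIVE = "DESTRUCTIVE"


@dataclass
class ToolEntry:
    name: str
    owner: str
    repo: str
    deployment: str
    # function name -> ActionClass
    actions: dict = field(default_factory=dict)
    # OAuth scopes/roles the tool needs upstream
    upstream_scopes: list = field(default_factory=list)
    # exact repos/projects/folders the tool may touch
    in_bounds: list = field(default_factory=list)


# Hypothetical example entry; every value here is a placeholder.
INVENTORY = [
    ToolEntry(
        name="issue-tracker",
        owner="platform-team",
        repo="example-org/mcp-issue-tracker",
        deployment="prod-cluster-a",
        actions={
            "get_issue": ActionClass.READ,
            "create_issue": ActionClass.WRITE,
            "delete_issue": ActionClass.DESTRUCTIVE,
        },
        upstream_scopes=["issues:read", "issues:write"],
        in_bounds=["example-org/api"],
    )
]


def unclassified(inventory, exposed):
    """Return (tool, function) pairs that are exposed but not classified.

    `exposed` maps tool name -> list of function names the server offers.
    """
    known = {entry.name: entry.actions for entry in inventory}
    missing = []
    for tool, functions in exposed.items():
        for fn in functions:
            if fn not in known.get(tool, {}):
                missing.append((tool, fn))
    return missing
```

Running `unclassified` in CI against the live function list turns "a new function shipped without a classification" into a failed review gate instead of a surprise.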

2) Identity + Credential Handling (No shared god tokens)
- Prefer OAuth with scoped access; avoid long-lived API keys where possible.
- Store secrets in a dedicated secrets manager (e.g., AWS Secrets Manager, HashiCorp Vault, or 1Password via its CLI) and rotate them regularly.
- Separate credentials per environment (dev/staging/prod) and per tenant/customer.
- Define revocation: who can revoke, how fast, and how you confirm it worked.

3) Policy Controls (Make the safe path the default)
- Enforce an allowlist of tools and functions/endpoints.
- Enforce resource allowlists (exact repos, exact projects, exact folders). Don’t rely on “the model will pick the right one.”
- Require approvals for any WRITE/DESTRUCTIVE category.
- Add rate limits and concurrency caps per user and per tool to prevent runaway loops.
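
The controls above can be combined into a single decision function that runs before every tool call. This is a rough sketch under stated assumptions: the `POLICY` table, the tool and resource names, and the decision strings are all invented for illustration, and the limiter covers per-user, per-tool rate limits only (concurrency caps would need separate tracking of in-flight calls).

```python
import time
from collections import defaultdict, deque

# Hypothetical policy table: tool/function -> action class and exact resources.
POLICY = {
    "tracker.get_issue": {"action": "READ", "resources": {"example-org/api", "example-org/web"}},
    "tracker.create_issue": {"action": "WRITE", "resources": {"example-org/api"}},
}


class RateLimiter:
    """Sliding-window limiter keyed by (user, tool)."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self._calls = defaultdict(deque)

    def allow(self, user, tool):
        now = self.clock()
        recent = self._calls[(user, tool)]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True


def decide(user, tool, resource, limiter, policy=POLICY):
    """Return 'allow', 'needs_approval', or a 'deny:<reason>' string."""
    rule = policy.get(tool)
    if rule is None:
        return "deny:tool_not_allowlisted"
    if resource not in rule["resources"]:
        return "deny:resource_not_allowlisted"
    if not limiter.allow(user, tool):
        return "deny:rate_limited"
    if rule["action"] in ("WRITE", "DESTRUCTIVE"):
        return "needs_approval"
    return "allow"
```

Anything not listed is denied, so the safe path is the default: adding a tool or a resource requires an explicit policy change rather than a model's judgment.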

4) Human Approval UX (Simple beats clever)
- Show the proposed action in plain language plus the exact tool/function.
- Show the target resource identifier (repo/project/id), not just a description.
- Provide a “view diff/receipt” step for changes (PR diff, ticket fields, file patch).
- Log approver identity and timestamp.
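
A minimal sketch of what an approval prompt and its logged decision might carry, assuming a plain-text approval surface. The `ApprovalRequest` fields and both helper functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ApprovalRequest:
    summary: str      # plain-language description of the proposed action
    tool: str         # exact tool name
    function: str     # exact function name
    resource_id: str  # exact target identifier, not a description
    diff: str         # PR diff, ticket fields, or file patch to review


def render(req):
    """Build the text shown to the approver."""
    return "\n".join([
        f"Proposed action: {req.summary}",
        f"Tool/function:   {req.tool}.{req.function}",
        f"Target:          {req.resource_id}",
        "Changes:",
        req.diff,
    ])


def record_decision(req, approver, approved, now=None):
    """Return the log record for an approval decision."""
    decided_at = now or datetime.now(timezone.utc)
    return {
        "tool": req.tool,
        "function": req.function,
        "resource_id": req.resource_id,
        "approver": approver,
        "approved": approved,
        "decided_at": decided_at.isoformat(),
    }
```

Keeping the rendered text and the logged record derived from the same request object makes it harder for the approver to see one target while the log records another.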

5) Audit Logging + Forensics (Answer ‘what happened?’)
- Emit structured JSON events for every tool call: actor, client, tool, resource, approval status, result, request ID.
- Redact sensitive fields in logs (tokens, secrets, PII fields you can identify).
- Set a retention policy that matches your customers’ expectations, and support export in a SIEM-friendly format.
- Provide a way to reconstruct a run: the ordered list of tool calls and outcomes.
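
As an illustration of the event shape, the sketch below emits one JSON line per tool call, redacts by key name, and rebuilds a run from its events. The field names beyond those listed above (`ts`, `run_id`, `seq`, `args`) and the `SENSITIVE_KEYS` set are assumptions made for this example; key-name redaction only catches fields you can name in advance.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed list of key names to redact; extend it for your own payloads.
SENSITIVE_KEYS = {"token", "authorization", "password", "secret", "api_key"}


def redact(value):
    """Recursively replace values stored under sensitive key names."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if str(k).lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def audit_event(*, actor, client, tool, resource, approval_status, result,
                args, run_id, seq, request_id=None):
    """Return one structured audit event as a JSON line."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id or str(uuid.uuid4()),
        "run_id": run_id,
        "seq": seq,  # position of this call within the run
        "actor": actor,
        "client": client,
        "tool": tool,
        "resource": resource,
        "approval_status": approval_status,
        "result": result,
        "args": redact(args),
    }
    return json.dumps(event, sort_keys=True)


def reconstruct_run(lines, run_id):
    """Return the events of one run, ordered by sequence number."""
    events = (json.loads(line) for line in lines)
    return sorted((e for e in events if e["run_id"] == run_id),
                  key=lambda e: e["seq"])
```

Redaction happens before serialization, so the raw secret never reaches the log pipeline at all.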

6) Safe Execution (Assume prompts get hostile)
- Treat all tool inputs as untrusted. Validate types and bounds.
- For tools that execute code or touch the filesystem/network: sandbox (container/VM) and restrict egress.
- Add a “read-only mode” for incident response and for testing.
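
A small sketch of input validation combined with a read-only switch, using a hypothetical `close_ticket` tool. The argument names, the numeric and length bounds, and the `RejectedCall` exception are all assumptions for illustration; sandboxing and egress restriction are deployment concerns that this snippet does not cover.

```python
class RejectedCall(Exception):
    """Raised when a tool call fails validation or gateway mode checks."""


ALLOWED_KEYS = {"ticket_id", "comment"}


def validate_close_ticket_args(args):
    """Validate types and bounds for the hypothetical close_ticket tool."""
    unknown = set(args) - ALLOWED_KEYS
    if unknown:
        raise RejectedCall(f"unexpected arguments: {sorted(unknown)}")
    ticket_id = args.get("ticket_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(ticket_id, bool) or not isinstance(ticket_id, int):
        raise RejectedCall("ticket_id must be an integer")
    if not 1 <= ticket_id <= 10_000_000:
        raise RejectedCall("ticket_id out of bounds")
    comment = args.get("comment", "")
    if not isinstance(comment, str) or len(comment) > 2000:
        raise RejectedCall("comment must be a string of at most 2000 characters")
    return {"ticket_id": ticket_id, "comment": comment}


def guard(action_class, args, read_only):
    """Block non-READ actions in read-only mode, then validate inputs."""
    if read_only and action_class != "READ":
        raise RejectedCall("gateway is in read-only mode")
    return validate_close_ticket_args(args)
```

Checking the mode flag before validation means an incident responder can flip one switch and know that no write reaches a tool, whatever the model proposes.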

7) Rollout Plan (Don’t big-bang production)
- Start with one workflow, three tools, and a small user group.
- Run in shadow mode where you log proposed actions but don’t execute writes.
- Define qualitative success criteria before you start: fewer manual steps, fewer escalations, clearer audits.
- Create an incident playbook: disable tool, revoke credentials, export logs, notify owners.
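
Shadow mode can be as simple as a wrapper that logs proposed writes instead of executing them. The sketch below assumes each call is a dict with an `action` field set to READ, WRITE, or DESTRUCTIVE; the function name, the `executor` callback, and the log record shape are hypothetical.

```python
def run_tool_call(call, executor, shadow=True, log=print):
    """Execute a tool call, or only log it when shadow mode blocks writes.

    `executor` is whatever function actually performs the call.
    """
    is_write = call["action"] in ("WRITE", "DESTRUCTIVE")
    if shadow and is_write:
        # Record what would have happened without touching the upstream system.
        log({"event": "shadow_skip", "tool": call["tool"], "resource": call["resource"]})
        return {"executed": False, "proposed": call}
    return {"executed": True, "result": executor(call)}
```

Reads still execute in shadow mode, so the agent's workflow stays realistic while the logged proposals show what it would have changed.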

If you can’t pass this checklist quickly, your real product isn’t an agent. It’s the missing control plane.