Agent Identity & Access Control Checklist (A‑IAM)

Goal: Treat every production agent as an identity with a lifecycle. Use this to get from “agent with a token” to “agent with governed access” without boiling the ocean.

1) Inventory (1–2 hours)
- List every agent name and purpose (support triage, data Q&A, infra automation, etc.).
- Record runtime (GitHub Actions, Kubernetes, serverless, VM) and owning team.
- Enumerate every tool/integration it can call: cloud APIs, DBs, Slack/Teams, email, Jira/ServiceNow, Salesforce, Stripe, internal services.
- For each tool, capture the current auth method (static key, OAuth user token, service account, workload identity).

2) Identity boundaries (same day)
- Create separate identities per environment: dev/staging/prod.
- Create separate identities per duty: “read-only analytics” must not share credentials with “write to CRM.”
- Assign a human owner per agent identity (for reviews and emergency shutdowns).

3) Credential policy (week 1)
- Eliminate long-lived cloud keys on the highest-risk agents first.
- Prefer OIDC/workload identity where available (cloud + CI runtimes).
- If static secrets remain, store them in a secrets manager and document rotation frequency and owner.
- Prohibit shared tokens across agents; shared tokens destroy auditability.

4) Authorization (week 1–2)
- For each tool, define minimum required scopes/roles.
- Deny-by-default for new tools: agents only call tools explicitly added to an allowlist.
- Segment access by data sensitivity (customer PII vs. logs vs. public data). If you can’t segment, treat it as sensitive.

5) High-risk actions & approvals (week 2)
- Define “high-risk actions” relevant to your business: refunds, permission changes, data exports, deletes, production deployments, infrastructure changes.
- Add an explicit approval gate: ticket ID, change request, or a human confirmation step.
- Ensure approvals are tied to an immutable action record (below).

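The deny-by-default allowlist (section 4) and the approval gate (section 5) can be sketched as a single authorization check. This is a minimal illustration, not a reference implementation: `AgentPolicy`, `authorize`, the `HIGH_RISK_ACTIONS` set, and the reason strings are all hypothetical names chosen for this example, and a real deployment would validate the approval reference against the ticketing or change system rather than only checking that one is present.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Hypothetical action names; replace with the high-risk list your business defines.
HIGH_RISK_ACTIONS = frozenset(
    {"refund", "permission_change", "data_export", "delete", "deploy"}
)


@dataclass(frozen=True)
class AgentPolicy:
    """One identity per agent, per environment, per duty (all fields are assumptions)."""
    agent_id: str
    environment: str              # "dev", "staging", or "prod"
    owner: str                    # human owner for reviews and emergency shutdowns
    allowed_tools: FrozenSet[str]


def authorize(
    policy: AgentPolicy,
    tool: str,
    action: str,
    approval_ref: Optional[str] = None,
) -> Tuple[bool, str]:
    """Return (allowed, reason). Tools missing from the allowlist are denied."""
    if tool not in policy.allowed_tools:
        return False, "tool_not_allowlisted"
    if action in HIGH_RISK_ACTIONS and not approval_ref:
        return False, "approval_required"
    return True, "allowed"


if __name__ == "__main__":
    policy = AgentPolicy(
        agent_id="support-triage-prod",
        environment="prod",
        owner="alice@example.com",
        allowed_tools=frozenset({"jira", "stripe"}),
    )
    print(authorize(policy, "slack", "post_message"))
    print(authorize(policy, "stripe", "refund"))
    print(authorize(policy, "stripe", "refund", approval_ref="CHG-1234"))
```

Returning a reason alongside the decision keeps the policy outcome available for the action record described in the audit logging step.
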
6) Audit logging (week 2)
- Emit an action record for every tool call: agent_id, requested_by (user/system), tool, resource, policy decision, approval reference, timestamp, result.
- Ensure logs are queryable and retained according to your org’s policy.
- Verify you can answer: “Which agent touched this customer record?” and “Who approved this action?”

7) Kill switch & offboarding (week 2–3)
- Implement a single “disable agent” control (gateway switch, identity revocation, or both).
- Document an on-call runbook: how to revoke tokens, rotate secrets, and disable tool access.
- Schedule quarterly access reviews: agents accumulate permissions just like people.

Acceptance test (the only one that matters): Pick one production agent and simulate compromise. You should be able to state (a) what it can do, (b) where that is logged, and (c) how to stop it, without guessing.
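The action record from section 6 and the “disable agent” control from section 7 might look like the following sketch. It assumes an in-memory set as the kill switch and a plain list of JSON lines as the log. Both are stand-ins for a gateway flag and an append-only log store. `ActionRecord`, `disable_agent`, and `run_tool_call` are hypothetical names, not part of any library.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional, Set

# Assumption: an in-memory kill switch; in practice a gateway flag or identity revocation.
DISABLED_AGENTS: Set[str] = set()


@dataclass(frozen=True)
class ActionRecord:
    """Fields mirror the action record listed in section 6."""
    agent_id: str
    requested_by: str
    tool: str
    resource: str
    policy_decision: str
    approval_ref: Optional[str]
    timestamp: str
    result: str


def disable_agent(agent_id: str) -> None:
    """Single control that blocks every later tool call from this agent."""
    DISABLED_AGENTS.add(agent_id)


def run_tool_call(
    log: List[str],
    agent_id: str,
    requested_by: str,
    tool: str,
    resource: str,
    call: Callable[[], object],
    approval_ref: Optional[str] = None,
) -> ActionRecord:
    """Run `call` unless the agent is disabled; always append one record to `log`."""
    if agent_id in DISABLED_AGENTS:
        decision, result = "deny:agent_disabled", "blocked"
    else:
        decision = "allow"
        try:
            call()
            result = "ok"
        except Exception as exc:  # record the failure instead of losing it
            result = f"error:{type(exc).__name__}"
    record = ActionRecord(
        agent_id=agent_id,
        requested_by=requested_by,
        tool=tool,
        resource=resource,
        policy_decision=decision,
        approval_ref=approval_ref,
        timestamp=datetime.now(timezone.utc).isoformat(),
        result=result,
    )
    log.append(json.dumps(asdict(record), sort_keys=True))
    return record


def agents_that_touched(log: List[str], resource: str) -> Set[str]:
    """Answer: which agent touched this resource?"""
    return {json.loads(line)["agent_id"] for line in log
            if json.loads(line)["resource"] == resource}
```

Because a record is written even when the call is blocked or fails, a query such as `agents_that_touched(log, "customer/42")` also surfaces attempts made after the agent was disabled, which is the kind of evidence the acceptance test asks for.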