Agent Identity & Access Control Checklist (A‑IAM)

Goal: Treat every production agent as an identity with a lifecycle. Use this to get from “agent with a token” to “agent with governed access” without boiling the ocean.

1) Inventory (1–2 hours)
- List every agent name and purpose (support triage, data Q&A, infra automation, etc.).
- Record runtime (GitHub Actions, Kubernetes, serverless, VM) and owning team.
- Enumerate every tool/integration it can call: cloud APIs, DBs, Slack/Teams, email, Jira/ServiceNow, Salesforce, Stripe, internal services.
- For each tool, capture the current auth method (static key, OAuth user token, service account, workload identity).

2) Identity boundaries (same day)
- Create separate identities per environment: dev/staging/prod.
- Create separate identities per duty: “read-only analytics” must not share credentials with “write to CRM.”
- Assign a human owner per agent identity (for reviews and emergency shutdowns).

3) Credential policy (week 1)
- Eliminate long-lived cloud keys on the highest-risk agents first.
- Prefer OIDC/workload identity where available (cloud + CI runtimes).
- If static secrets remain, store them in a secrets manager and document rotation frequency and owner.
- Prohibit shared tokens across agents; shared tokens destroy auditability.

4) Authorization (week 1–2)
- For each tool, define minimum required scopes/roles.
- Deny-by-default for new tools: agents only call tools explicitly added to an allowlist.
- Segment access by data sensitivity (customer PII vs. logs vs. public data). If you can’t segment, treat it as sensitive.

5) High-risk actions & approvals (week 2)
- Define “high-risk actions” relevant to your business: refunds, permission changes, data exports, deletes, production deployments, infrastructure changes.
- Add an explicit approval gate: ticket ID, change request, or a human confirmation step.
- Ensure approvals are tied to an immutable action record (below).

6) Audit logging (week 2)
- Emit an action record for every tool call: agent_id, requested_by (user/system), tool, resource, policy decision, approval reference, timestamp, result.
- Ensure logs are queryable and retained according to your org’s policy.
- Verify you can answer: “Which agent touched this customer record?” and “Who approved this action?”

7) Kill switch & offboarding (week 2–3)
- Implement a single “disable agent” control (gateway switch, identity revocation, or both).
- Document an on-call runbook: how to revoke tokens, rotate secrets, and disable tool access.
- Schedule quarterly access reviews: agents accumulate permissions just like people.

Acceptance test (the only one that matters): Pick one production agent and simulate compromise. You should be able to state (a) what it can do, (b) where that is logged, and (c) how to stop it—without guessing.
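
One way to wire sections 4 to 7 together, as an added example that is not part of the original checklist: a single gate in front of every tool call that applies the kill switch, the deny-by-default allowlist and the approval requirement, and writes an action record whether the call is allowed or not. The agent, tool and action names, the `action` field and the in-memory `AUDIT_LOG` are assumptions; the approval reference is only checked for presence, not validated against a ticketing system.

```python
import json
from datetime import datetime, timezone

# Constructed sample policy: agent, tool and action names are hypothetical.
POLICY = {"support-triage-prod": {
    "enabled": True,  # kill switch (section 7)
    "tools": {"crm": {"read", "refund"}, "slack": {"post"}},  # allowlist (section 4)
}}
HIGH_RISK = {("crm", "refund")}  # section 5: calls that need an approval
AUDIT_LOG = []  # stand-in for an append-only log sink


def authorize(agent_id, requested_by, tool, action, resource, approval_ref=None):
    """Decide one tool call and emit an action record, allowed or not."""
    agent = POLICY.get(agent_id)
    if agent is None:
        decision = "deny:unknown_agent"
    elif not agent["enabled"]:
        decision = "deny:agent_disabled"
    elif action not in agent["tools"].get(tool, set()):
        decision = "deny:not_allowlisted"  # deny-by-default
    elif (tool, action) in HIGH_RISK and not approval_ref:
        decision = "deny:approval_required"
    else:
        decision = "allow"
    record = {
        "agent_id": agent_id,
        "requested_by": requested_by,
        "tool": tool,
        "action": action,  # assumption: not in the checklist's field list
        "resource": resource,
        "policy_decision": decision,
        "approval_reference": approval_ref,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "result": "pending" if decision == "allow" else "blocked",
    }
    AUDIT_LOG.append(json.dumps(record, sort_keys=True))
    return record


def disable_agent(agent_id):
    POLICY[agent_id]["enabled"] = False  # one control stops every tool call


def who_touched(resource):
    """Agents with an allowed call on this resource, from the log alone."""
    return sorted({r["agent_id"] for r in map(json.loads, AUDIT_LOG)
                   if r["resource"] == resource and r["policy_decision"] == "allow"})
```

Running the acceptance test against this gate means calling `disable_agent` and confirming that the next `authorize` call is denied and still logged.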