AGENTIC FEATURE LAUNCH CHECKLIST (2026)

Goal: ship ONE AI-driven action (a write to an external system) with controls that survive security review.

1) Define the action contract
- Name the action and classify it: reversible vs irreversible.
- List all side effects (external emails sent, records changed, money moved, access granted).
- Define an “undo” path: rollback, compensating action, or explicit “cannot be undone” warning.
- Decide what a user must see before approval: diff, recipients, amounts, records touched.
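
One way to keep the contract honest is to write it down as data and refuse to ship while any part is missing. The sketch below is illustrative only: `ActionContract`, `Reversibility`, and the `send_invoice_email` example are hypothetical names, not part of any existing library.

```python
from dataclasses import dataclass
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"
    IRREVERSIBLE = "irreversible"


@dataclass(frozen=True)
class ActionContract:
    """Hypothetical record of the contract for one agent action."""
    name: str
    reversibility: Reversibility
    side_effects: tuple[str, ...]      # e.g. emails sent, records changed
    undo_path: str                     # rollback, compensation, or warning text
    approval_preview: tuple[str, ...]  # what the user sees before approving

    def validate(self) -> list[str]:
        """Return the list of gaps; an empty list means the contract is complete."""
        problems = []
        if not self.side_effects:
            problems.append("side effects not listed")
        if not self.undo_path:
            problems.append("no undo path or cannot-be-undone warning")
        if not self.approval_preview:
            problems.append("nothing shown to the user before approval")
        return problems


# Example contract for an irreversible action (illustrative values).
send_invoice = ActionContract(
    name="send_invoice_email",
    reversibility=Reversibility.IRREVERSIBLE,
    side_effects=("external email sent",),
    undo_path="Cannot be undone: warn the user before sending",
    approval_preview=("recipients", "amounts"),
)
```

Calling `send_invoice.validate()` returns an empty list here; a contract with no listed side effects, undo path, or preview would return three problems.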

2) Identity + permissioning (non-negotiable)
- The agent must act “on behalf of” a user or a tightly-scoped system role.
- Use per-user OAuth where possible; avoid shared admin tokens.
- Enforce the same RBAC rules your UI enforces.
- Scope permissions to the minimum tool operations (read vs write; create vs delete).
- Require recent authentication for sensitive actions: force re-auth when the last login is older than a set window.
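
The scope and re-auth rules above can be sketched as a single server-side check. This is a minimal sketch under stated assumptions: `Principal`, `authorize`, the scope strings, and the five-minute `REAUTH_WINDOW` are all hypothetical, and a real system would map them onto its own OAuth scopes and RBAC roles.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed window; pick a value that matches your own risk tolerance.
REAUTH_WINDOW = timedelta(minutes=5)


@dataclass(frozen=True)
class Principal:
    """The user (or tightly scoped system role) the agent acts on behalf of."""
    user_id: str
    scopes: frozenset[str]
    last_auth_at: datetime


def authorize(principal: Principal, required_scope: str,
              sensitive: bool, now: datetime) -> str:
    """Return 'allow', 'deny', or 'reauth_required' for one tool operation."""
    # Least privilege: the exact operation scope must have been granted.
    if required_scope not in principal.scopes:
        return "deny"
    # Sensitive actions also need a recent authentication.
    if sensitive and now - principal.last_auth_at > REAUTH_WINDOW:
        return "reauth_required"
    return "allow"


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
stale_user = Principal("u1", frozenset({"crm:read", "crm:write"}),
                       now - timedelta(minutes=30))
print(authorize(stale_user, "crm:write", sensitive=True, now=now))
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the check easy to test.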

3) Policy gates
- Create a simple policy matrix: allow / require approval / deny.
- Gate conditions: action type, target domain, amount threshold (if relevant), bulk volume, external recipients.
- Ensure gates are enforced server-side (not only in the client).
- Add rate limits or action budgets per run.
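
A policy matrix of this kind can start as one small function that runs on the server. The sketch below assumes hypothetical thresholds (`AMOUNT_THRESHOLD`, `BULK_THRESHOLD`) and a hypothetical deny list; the names and values are placeholders, not recommendations.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


@dataclass(frozen=True)
class ActionRequest:
    action_type: str
    amount: float = 0.0
    record_count: int = 1
    external_recipients: bool = False


# Assumed policy values, for illustration only.
DENIED_ACTIONS = {"delete_account"}
AMOUNT_THRESHOLD = 500.0
BULK_THRESHOLD = 50


def evaluate(req: ActionRequest) -> Decision:
    """Map a request to allow / require approval / deny."""
    if req.action_type in DENIED_ACTIONS:
        return Decision.DENY
    if (req.amount > AMOUNT_THRESHOLD
            or req.record_count > BULK_THRESHOLD
            or req.external_recipients):
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


class ActionBudget:
    """Caps how many actions a single run may execute."""

    def __init__(self, max_actions: int) -> None:
        self.remaining = max_actions

    def try_spend(self) -> bool:
        """Consume one unit of budget; return False once the budget is spent."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True
```

Deny rules are checked first so that no approval path can override them.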

4) Provenance + explainability (operational, not marketing)
- Log retrieval sources (system, doc/record ID, timestamp).
- Store the proposed change as a diff against current state.
- Store tool call attempts (operation, parameters, success/failure).
- Show provenance in the UI where the user approves.
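
Storing the proposed change as a diff, together with its sources, might look like the following. It is a sketch under assumptions: records are flat dicts with string keys, and `Source`, `diff_record`, and `build_approval_payload` are hypothetical names.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Source:
    """One retrieval source: which system, which record, and when."""
    system: str
    record_id: str
    retrieved_at: str  # ISO 8601 timestamp


def diff_record(current: dict, proposed: dict) -> dict:
    """Field-level diff of two flat records.

    A key missing on one side is reported as None on that side.
    """
    changes = {}
    for key in sorted(set(current) | set(proposed)):
        before, after = current.get(key), proposed.get(key)
        if before != after:
            changes[key] = {"before": before, "after": after}
    return changes


def build_approval_payload(current: dict, proposed: dict,
                           sources: list[Source]) -> dict:
    """Bundle what the approving user should see: the diff plus its sources."""
    return {
        "diff": diff_record(current, proposed),
        "sources": [asdict(s) for s in sources],
    }
```

The same payload can be logged and rendered in the approval UI, so the user approves exactly what was recorded.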

5) Observability + audit export
- Write an immutable “agent action event” for every attempt.
- Include: who requested, on-behalf-of identity, tools used, policy decision, and final status.
- Make logs searchable by user, object ID, and run ID.
- Provide export to common SIEM destinations (at least via webhook or file export).
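
The event fields listed above can be sketched as a frozen record in an append-only log. Note the assumptions: `AgentActionEvent` and `AuditLog` are hypothetical, the log lives in memory, and the hash chain only makes tampering detectable. Real immutability needs append-only or write-once storage.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentActionEvent:
    run_id: str
    requested_by: str
    on_behalf_of: str
    tools_used: tuple[str, ...]
    policy_decision: str
    final_status: str
    object_id: str
    prev_hash: str  # digest of the previous event; "" for the first one

    def digest(self) -> str:
        """SHA-256 over the event's canonical JSON form."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class AuditLog:
    """In-memory, append-only log (a stand-in for durable storage)."""

    def __init__(self) -> None:
        self._events: list[AgentActionEvent] = []

    def append(self, **fields) -> AgentActionEvent:
        prev = self._events[-1].digest() if self._events else ""
        event = AgentActionEvent(prev_hash=prev, **fields)
        self._events.append(event)
        return event

    def search(self, **criteria) -> list[AgentActionEvent]:
        """Filter by exact field match, e.g. search(run_id='r1')."""
        return [e for e in self._events
                if all(getattr(e, k) == v for k, v in criteria.items())]

    def verify_chain(self) -> bool:
        """Check that each event points at the digest of its predecessor."""
        prev = ""
        for event in self._events:
            if event.prev_hash != prev:
                return False
            prev = event.digest()
        return True

    def export_jsonl(self) -> str:
        """One JSON object per line, suitable for a file export or webhook body."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True)
                         for e in self._events)
```

Every attempt gets an event, including denied and failed ones; `final_status` carries the outcome.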

6) Kill switch + incident response
- Implement a global disable that stops actions even if the UI is unavailable.
- Implement a connector-level disable that turns off writes while reads can remain.
- Add an emergency mode to force approvals for everything.
- Document who can trigger the kill switch and how it’s audited.
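
The three switches above can be combined into one check that runs on the server before every tool call. This is a sketch with hypothetical names (`KillSwitch`, `check`). It assumes "action" means a write, as in the goal above, so reads pass through the global disable, and it assumes the flags are loaded from a store that does not depend on the UI.

```python
from dataclasses import dataclass, field


@dataclass
class KillSwitch:
    """Flags assumed to be loaded from a store that does not depend on the UI."""
    global_disabled: bool = False
    force_approval: bool = False
    write_disabled_connectors: set[str] = field(default_factory=set)

    def check(self, connector: str, is_write: bool, policy_decision: str) -> str:
        """Apply the switches on top of the normal policy decision."""
        # Global disable: no agent write goes out.
        if is_write and self.global_disabled:
            return "deny"
        # Connector-level disable: writes off, reads unaffected.
        if is_write and connector in self.write_disabled_connectors:
            return "deny"
        # Emergency mode: anything that would be allowed now needs approval.
        if self.force_approval and policy_decision == "allow":
            return "require_approval"
        return policy_decision


switch = KillSwitch()
switch.write_disabled_connectors.add("crm")
print(switch.check("crm", is_write=True, policy_decision="allow"))
print(switch.check("crm", is_write=False, policy_decision="allow"))
```

A switch can only make a decision stricter: a `deny` from the policy gate stays a `deny` in every mode.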

7) Pre-launch tests
- Permission tests: confirm the agent cannot access data that the requesting user cannot access.
- Negative tests: prompt injection attempts, tool misuse attempts, cross-tenant boundary checks.
- Replay tests: can you reconstruct “why it happened” from logs alone?
- Rollback drills: practice undo/compensation for the chosen action.
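
Permission and cross-tenant checks lend themselves to small automated tests. The sketch below uses a toy access rule, with hypothetical names (`Record`, `agent_read`, `denied`), to show the shape of such tests. A real suite would exercise the actual tool layer with the same assertions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    tenant_id: str
    owner_id: str


class PermissionDenied(Exception):
    pass


def agent_read(record: Record, acting_user_id: str, acting_tenant_id: str) -> Record:
    """Toy access rule: same tenant and same owner, otherwise refuse."""
    if record.tenant_id != acting_tenant_id:
        raise PermissionDenied("cross-tenant access")
    if record.owner_id != acting_user_id:
        raise PermissionDenied("user lacks access to this record")
    return record


def denied(fn, *args) -> bool:
    """Return True if the call was refused with PermissionDenied."""
    try:
        fn(*args)
    except PermissionDenied:
        return True
    return False


record = Record("r1", "tenant-a", "alice")

# Positive case: the owner, in the right tenant, can read.
assert agent_read(record, "alice", "tenant-a") is record
# Permission test: another user in the same tenant is refused.
assert denied(agent_read, record, "bob", "tenant-a")
# Cross-tenant test: the same user ID in another tenant is refused.
assert denied(agent_read, record, "alice", "tenant-b")
```

Negative cases are asserted explicitly, so a regression that silently widens access fails the run.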

Ship criteria: if you can’t explain identity, policy, provenance, and shutdown in under five minutes to a security reviewer, it’s not ready.