Agent Readiness Sprint Pack (1 Week)

Goal
Ship one supervised agent workflow that performs a real state-changing action with provable controls: scoped auth, approvals, audit logs, rate limits, and rollback.

Day 0: Choose the workflow (30–60 minutes)
- Pick ONE action that changes state (examples: issue a refund, grant app access, update a billing field, rotate a credential, merge a PR).
- Write a tight success condition: “Agent can propose and execute X, but only with approval Y, within scope Z.”

Day 1: Tool contract
- Define a single tool endpoint with a strict JSON schema. Include:
  - dry_run: true/false
  - idempotency_key
  - explicit currency/units/timezone fields if relevant
- List validation rules the server must enforce (eligibility, limits, required fields).
- Define error semantics: which errors are retriable and which are terminal.
Deliverable: Tool spec + example requests/responses.

Day 2: Identity + authorization
- Create a dedicated service identity for the agent (no shared tokens).
- Implement least-privilege scopes (tool-specific, resource-specific if possible).
- Decide how human identity is captured for approvals (SSO group, ticket assignee, on-call).
Deliverable: Auth scopes document + token rotation plan.

Day 3: Approval policy (server-side)
- Define approval triggers by risk level:
  - Always require approval (e.g., refunds above threshold, access grants to production)
  - Auto-approve within policy (small, reversible actions)
- Implement the approval gate in middleware/workflow engine, not in prompts.
- Ensure “deny” is an explicit state that ends execution.
Deliverable: Policy rules + approval UI/ticket integration.

Day 4: Observability + audit
- Log every tool call with:
  - agent identity, approving human identity
  - tool name, parameters (redact secrets), result
  - correlation IDs across steps
- Add rate limits and budgets (requests per minute, maximum tool invocations per task).
Deliverable: Audit log query that reconstructs a full action chain.

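The Day 1, Day 3, and Day 4 controls can be hard to picture in the abstract, so here is a minimal in-memory sketch using the refund example. Everything named here is a hypothetical stand-in rather than part of this pack: RefundRequest, Approval, issue_refund, the threshold and limit values, and the dict/list used in place of a real datastore and log pipeline. It shows server-side validation, an approval gate with an explicit "denied" state, idempotent execution, and one audit record per call.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# All limits and names below are illustrative assumptions.
APPROVAL_THRESHOLD_MINOR = 5_000   # refunds above this need a human approval
MAX_REFUND_MINOR = 50_000          # hard server-side limit
ALLOWED_CURRENCIES = {"USD", "EUR"}


class TerminalError(Exception):
    """The request is invalid; retrying it unchanged will never succeed."""


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount_minor: int       # integer minor units (e.g. cents), explicit currency
    currency: str
    idempotency_key: str
    dry_run: bool = True    # default to the safe mode


@dataclass(frozen=True)
class Approval:
    approver: str           # human identity, e.g. from SSO or a ticket
    approved: bool


_executed: dict = {}        # idempotency_key -> response of the executed call
audit_log: list = []        # stand-in for an append-only audit store


def validate(req: RefundRequest) -> None:
    if req.currency not in ALLOWED_CURRENCIES:
        raise TerminalError(f"unsupported currency: {req.currency}")
    if not 0 < req.amount_minor <= MAX_REFUND_MINOR:
        raise TerminalError("amount outside allowed range")


def decide(req: RefundRequest, approval: Optional[Approval]) -> dict:
    validate(req)
    if req.idempotency_key in _executed:
        return _executed[req.idempotency_key]   # replay: no second mutation
    needs_approval = req.amount_minor > APPROVAL_THRESHOLD_MINOR
    if req.dry_run:
        return {"status": "dry_run", "would_refund_minor": req.amount_minor,
                "needs_approval": needs_approval}
    if approval is not None and not approval.approved:
        return {"status": "denied"}             # explicit state, ends execution
    if needs_approval and approval is None:
        return {"status": "pending_approval"}
    result = {"status": "executed", "refunded_minor": req.amount_minor}
    _executed[req.idempotency_key] = result     # the real mutation would go here
    return result


def issue_refund(req: RefundRequest, agent_id: str, correlation_id: str,
                 approval: Optional[Approval] = None) -> dict:
    try:
        response = decide(req, approval)
    except TerminalError as exc:
        response = {"status": "rejected", "retriable": False, "error": str(exc)}
    audit_log.append({
        "correlation_id": correlation_id,
        "agent_id": agent_id,
        "approver": approval.approver if approval else None,
        "tool": "issue_refund",
        "params": asdict(req),                  # no secrets in this request type
        "result": response,
    })
    return response


# Usage: a refund above the threshold waits for approval, then executes once.
demo = RefundRequest("order-1", 7_500, "USD", "demo-key", dry_run=False)
issue_refund(demo, "agent-refunds", "corr-demo")                          # pending_approval
issue_refund(demo, "agent-refunds", "corr-demo", Approval("alice", True)) # executed
```

The gate lives in decide, which the agent cannot influence through prompt text; the only way past it is an Approval record supplied by the surrounding workflow. Filtering audit_log by correlation_id gives a rough picture of the Day 4 deliverable.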

Day 5: Safety drills
- Kill switch: one action that halts the agent and revokes its credentials.
- Rollback: document compensating actions; support dry_run diff output.
- Tabletop exercise: simulate a bad tool call, a loop, and a permission failure.
Deliverable: Incident runbook + evidence the kill switch works.

Acceptance checklist (ship/no-ship)
- The tool endpoint rejects out-of-scope requests deterministically.
- Approvals are enforced server-side and can’t be bypassed by prompt changes.
- Idempotency prevents duplicate mutations.
- Logs can answer: who approved, what was changed, and when.
- You can pause the agent immediately and rotate/revoke its credentials.

Notes
Avoid expanding scope. One workflow with real controls beats five demos held together by prompts.
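
For the Day 5 drills, it may help to have something concrete to run the tabletop exercise against. The following is a minimal sketch under stated assumptions: AgentControl, its method names, and the in-memory token set are hypothetical, and a real deployment would revoke credentials at the identity provider rather than in process memory. It shows a single kill action that halts the agent and revokes its tokens, plus a per-task invocation budget that stops a runaway loop.

```python
class AgentHalted(Exception):
    """Raised for every call after the kill switch has been triggered."""


class BudgetExceeded(Exception):
    """Raised when a task uses more tool invocations than its budget."""


class AgentControl:
    """Hypothetical gatekeeper that each tool call passes through first."""

    def __init__(self, max_calls_per_task: int) -> None:
        self.max_calls_per_task = max_calls_per_task
        self.halted = False
        self.active_tokens: set = set()
        self._calls: dict = {}          # task_id -> invocations used so far

    def issue_token(self, token: str) -> None:
        self.active_tokens.add(token)

    def kill(self) -> None:
        # One action: stop the agent and revoke all of its credentials.
        self.halted = True
        self.active_tokens.clear()

    def authorize(self, token: str, task_id: str) -> None:
        if self.halted:
            raise AgentHalted("agent has been halted")
        if token not in self.active_tokens:
            raise PermissionError("unknown or revoked token")
        used = self._calls.get(task_id, 0)
        if used >= self.max_calls_per_task:
            raise BudgetExceeded(f"task {task_id} exceeded its call budget")
        self._calls[task_id] = used + 1


# Tabletop drill: a loop is cut off by the budget, then the kill switch fires.
control = AgentControl(max_calls_per_task=3)
control.issue_token("tok-1")
calls_made = 0
try:
    while True:                         # simulated runaway agent loop
        control.authorize("tok-1", "task-A")
        calls_made += 1
except BudgetExceeded:
    pass                                # loop stopped after 3 calls

control.kill()                          # every later authorize() raises AgentHalted
```

Capturing the exceptions from a drill like this, together with the matching audit entries, is one way to produce the "evidence the kill switch works" deliverable.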