AGENT SURFACE SPEC — 1-PAGE TEMPLATE (2026) Goal: Define ONE workflow where an agent can act safely. If you can’t fill this out without hand-waving, do not ship write-access. 1) Workflow (name + scope) - Workflow name: - Target user (role): - System(s) of record touched (e.g., Jira, Salesforce, Gmail, GitHub): - Read-only / Draft / Propose / Auto-execute (pick one): - “Done” definition (observable): 2) Inputs and triggers - Manual trigger (button/command): - Event trigger (optional): - Required context objects (links/IDs, not “general context”): - Out-of-scope explicitly (what the agent must refuse): 3) Tool contracts (actions) For each tool action, list: - Action name: - Required parameters (schema-level): - Allowed values / constraints: - Idempotency key strategy (how you prevent duplicates): - Failure modes you will surface to the user: 4) Proposal format (what user reviews) - What does the agent output as a proposal? (diffs, checklist steps, queued actions) - Where does the proposal live? (task inbox, PR-style view, doc suggestions) - Edit controls: what can user change before approval? 5) Approvals + permissions - Who can approve? (user, manager, admin) - Approval policy options (minimum): - Always require approval - Require approval for specific action types - Auto-execute only in specific scopes (project/label/customer segment) - Auth model: - Per-user OAuth token OR service account with scoped permissions - Token rotation / revocation behavior 6) Audit + observability - User-visible timeline (must include): trigger, tool calls, approvals, executed changes, errors - Admin export (format/location): - What you log for debugging without storing sensitive data unnecessarily: 7) Rollback plan - What is reversible? List each action and reversal method. - What is not reversible? Define the “are you sure” friction. - Time window and UX for undo: 8) Safety and abuse constraints - Rate limits (user-facing behavior): - Data boundaries (projects, tenants, repos): - Prompt injection handling approach (at minimum: treat external content as untrusted input; restrict tool calls by policy): 9) Pricing unit (pick one) - Per approved action - Per completed workflow - Per integration / connector - Per environment (dev/staging/prod equivalent) Define the unit in one sentence so Finance and Sales can repeat it without translation. 10) Launch checklist (gate to production) - [ ] Read-only mode shipped and used (qualitatively validated by real users) - [ ] Draft/proposal UI is in the system of record (not a separate “AI page”) - [ ] Approval + permission model documented and tested - [ ] Audit view exists for users AND admins - [ ] Rollback exists for top actions; non-reversible actions have explicit friction - [ ] Idempotency and retry behavior tested against real API errors and rate limits - [ ] Support runbook written: common failures + how to diagnose from logs - [ ] Kill switch: ability to disable agent execution by scope (tenant/project/user) If you complete this and still want to ship “fully autonomous,” you’ll either be ready—or you’ll have talked yourself out of it for the right reasons.