CAPABILITY CONTRACT TEMPLATE (AI Tool Interface)

Purpose
Define a single product capability that can be safely invoked by an AI system (or another service) with clear permissions, predictable behavior, and auditability.

1) Capability Name
- Tool name (verb_noun):
- Human description (one sentence):
- Non-goals (what this tool will not do):

2) Inputs (Schema)
- Required fields:
 - field_name: type, allowed values, validation rules
- Optional fields:
 - field_name: type, defaults
- Sensitive fields (PII/PHI/financial):
 - field_name: handling rules (redaction, encryption, retention)
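
A filled-in input schema could be enforced along the lines of the sketch below. It assumes a hypothetical send_invoice tool; the field names (customer_id, amount_cents, currency, customer_email) and the allowed currencies are illustrative assumptions, not part of this template.

```python
# Hypothetical schema for an illustrative send_invoice tool.
ALLOWED_CURRENCIES = {"USD", "EUR"}    # allowed values for an optional field
SENSITIVE_FIELDS = {"customer_email"}  # fields that need redaction before logging


def validate_input(payload: dict) -> dict:
    """Return a mapping of field_name -> error message; empty means valid."""
    errors = {}

    # Required field: non-empty string.
    customer_id = payload.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id:
        errors["customer_id"] = "required non-empty string"

    # Required field: positive integer. bool is a subclass of int in Python,
    # so it is rejected explicitly.
    amount = payload.get("amount_cents")
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        errors["amount_cents"] = "required positive integer"

    # Optional field with a default and a closed set of allowed values.
    currency = payload.get("currency", "USD")
    if not isinstance(currency, str) or currency not in ALLOWED_CURRENCIES:
        errors["currency"] = "must be one of: " + ", ".join(sorted(ALLOWED_CURRENCIES))

    return errors


def redact(payload: dict) -> dict:
    """Return a copy of the payload with sensitive fields masked."""
    return {k: ("[REDACTED]" if k in SENSITIVE_FIELDS else v) for k, v in payload.items()}
```

Returning per-field errors rather than raising on the first problem lets the caller fix every invalid field in one round trip.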

3) Outputs (Schema)
- Success response fields:
- Partial success fields (if applicable):
- Links to resulting objects (IDs, URLs):

4) Authorization & Scope
- Actor types allowed: (user, admin, service account)
- Required scopes/roles:
- Tenant/org boundary rules:
- Consent requirements:
 - How consent is captured
 - How consent is revoked
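
The authorization checks above might be combined into one server-side function, as in this minimal sketch. The Actor shape, the scope name invoices:write, the allowed actor types, and the reason strings are all hypothetical; a real service would load them from its own identity and consent systems.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Actor:
    actor_id: str
    actor_type: str       # "user", "admin", or "service_account"
    tenant_id: str
    scopes: frozenset
    has_consent: bool     # assumed to reflect the current (unrevoked) consent state


# Hypothetical policy for a single tool.
ALLOWED_ACTOR_TYPES = {"user", "admin"}
REQUIRED_SCOPES = {"invoices:write"}


def authorize(actor: Actor, resource_tenant_id: str) -> Tuple[bool, str]:
    """Return (allowed, reason). Every check runs on the server, never in the agent."""
    if actor.actor_type not in ALLOWED_ACTOR_TYPES:
        return False, "actor_type_not_allowed"
    if actor.tenant_id != resource_tenant_id:
        return False, "cross_tenant"
    if not REQUIRED_SCOPES <= actor.scopes:
        return False, "missing_scope"
    if not actor.has_consent:
        return False, "consent_missing_or_revoked"
    return True, "ok"
```

The reason string is meant for the audit log; whether it is also returned to the caller is a design choice, since detailed denial reasons can leak information across tenants.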

5) Side Effects & Idempotency
- External side effects (emails sent, records created, payments initiated):
- Idempotency strategy:
 - Idempotency key format:
 - Duplicate request behavior:
- Retry safety:
 - Which errors are safe to retry
 - Backoff guidance
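
One possible shape for the idempotency and retry items above is sketched here. The key format (tenant, tool name, payload hash), the in-memory store, and the backoff parameters are assumptions chosen for illustration; a production service would need a durable, shared store with atomic check-and-set.

```python
import hashlib
import json
import random


def idempotency_key(tenant_id: str, tool_name: str, payload: dict) -> str:
    """Derive a stable key: identical logical requests map to the same key."""
    # sort_keys makes the serialization independent of dict ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tenant_id}:{tool_name}:{digest[:32]}"


class IdempotentExecutor:
    """Replays the stored result for a duplicate key instead of re-running the action.

    In-memory and not thread-safe; illustration only.
    """

    def __init__(self):
        self._results = {}

    def execute(self, key: str, action):
        if key in self._results:
            return self._results[key]  # duplicate request: no new side effect
        result = action()
        self._results[key] = result
        return result


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

With this arrangement, a caller retrying after a transient failure reuses the same key, so a request that actually succeeded on the first attempt is not executed twice.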

6) Policy Enforcement (Server-side)
- Hard constraints (must always be enforced):
- Soft constraints (warnings/approvals):
- High-risk gates:
 - Require human approval? (yes/no)
 - Thresholds (describe qualitatively if no numeric limits exist):
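
As a rough sketch of how hard constraints, soft constraints, and a high-risk gate could fit together in one server-side decision function: the thresholds, rule names, and the new-recipient rule below are hypothetical examples, not recommendations.

```python
from enum import Enum
from typing import Tuple


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical limits for an illustrative payment-like tool.
HARD_MAX_AMOUNT_CENTS = 1_000_000      # hard constraint: never exceeded
APPROVAL_THRESHOLD_CENTS = 100_000     # high-risk gate: human approval required


def evaluate_policy(amount_cents: int, recipient_is_new: bool) -> Tuple[Decision, str]:
    """Return (decision, rule_triggered). Hard constraints are checked first."""
    if amount_cents > HARD_MAX_AMOUNT_CENTS:
        return Decision.DENY, "hard_max_amount"
    if amount_cents >= APPROVAL_THRESHOLD_CENTS:
        return Decision.NEEDS_APPROVAL, "high_amount"
    if recipient_is_new:
        # Soft constraint: allowed in principle, but routed for approval.
        return Decision.NEEDS_APPROVAL, "new_recipient"
    return Decision.ALLOW, "none"
```

Returning the triggered rule alongside the decision gives the audit log a concrete reason for each approval or denial.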

7) Error Taxonomy (Machine-actionable)
Define stable error codes and what the caller should do.
- INVALID_INPUT: caller should fix fields
- UNAUTHORIZED / FORBIDDEN: caller should request scope/consent
- NOT_FOUND: caller should refresh state
- CONFLICT: caller should re-fetch and reconcile
- RATE_LIMITED: caller should slow down / reschedule
- TRANSIENT: caller should retry with backoff, reusing the same idempotency key
For each code, specify:
- Example message
- Fields returned (e.g., field_errors)
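
The taxonomy above could be encoded so that callers branch on a stable code instead of parsing messages. This is a minimal sketch: the action labels, the retryable set (which treats RATE_LIMITED as retryable after a delay), and the response field names are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class ErrorCode(str, Enum):
    INVALID_INPUT = "INVALID_INPUT"
    UNAUTHORIZED = "UNAUTHORIZED"
    FORBIDDEN = "FORBIDDEN"
    NOT_FOUND = "NOT_FOUND"
    CONFLICT = "CONFLICT"
    RATE_LIMITED = "RATE_LIMITED"
    TRANSIENT = "TRANSIENT"


# What the caller should do for each code (labels are illustrative).
CALLER_ACTION = {
    ErrorCode.INVALID_INPUT: "fix_fields",
    ErrorCode.UNAUTHORIZED: "request_scope_or_consent",
    ErrorCode.FORBIDDEN: "request_scope_or_consent",
    ErrorCode.NOT_FOUND: "refresh_state",
    ErrorCode.CONFLICT: "refetch_and_reconcile",
    ErrorCode.RATE_LIMITED: "slow_down_or_reschedule",
    ErrorCode.TRANSIENT: "retry_with_backoff",
}

# Assumption: only these codes are safe to retry without changing the request.
RETRYABLE = frozenset({ErrorCode.RATE_LIMITED, ErrorCode.TRANSIENT})


@dataclass
class ToolError:
    code: ErrorCode
    message: str
    field_errors: Dict[str, str] = field(default_factory=dict)
    retry_after_seconds: Optional[float] = None

    def to_dict(self) -> dict:
        """Serialize to a machine-actionable response body."""
        return {
            "code": self.code.value,
            "message": self.message,
            "caller_action": CALLER_ACTION[self.code],
            "retryable": self.code in RETRYABLE,
            "field_errors": dict(self.field_errors),
            "retry_after_seconds": self.retry_after_seconds,
        }
```

For example, ToolError(ErrorCode.INVALID_INPUT, "amount_cents must be positive", {"amount_cents": "must be > 0"}).to_dict() yields a body whose code, caller_action, and field_errors an agent can act on without reading the message text.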

8) Observability & Audit
- What to log (structured):
 - actor_id, tenant_id, tool_name, request_id, idempotency_key
 - input summary (redacted), output summary (redacted)
 - policy decisions (approved/denied, rule triggered)
- Correlation strategy:
 - trace_id propagation across agent + tool service
- Retention policy:
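
A structured audit record carrying the fields listed above might be built as follows. The header name x-trace-id, the record keys, and the redaction marker are assumptions for illustration; an existing tracing standard or logging pipeline may dictate different names.

```python
import json
import uuid

TRACE_HEADER = "x-trace-id"  # hypothetical header used to propagate the trace id


def resolve_trace_id(headers: dict) -> str:
    """Reuse the caller's trace id if present, otherwise start a new trace."""
    return headers.get(TRACE_HEADER) or uuid.uuid4().hex


def redact(payload: dict, sensitive_fields: set) -> dict:
    """Mask sensitive values so they never reach the log."""
    return {k: ("[REDACTED]" if k in sensitive_fields else v) for k, v in payload.items()}


def audit_record(actor_id, tenant_id, tool_name, request_id, idempotency_key,
                 trace_id, request, response, decision, rule, sensitive_fields) -> str:
    """Return one JSON log line for a single tool invocation."""
    record = {
        "actor_id": actor_id,
        "tenant_id": tenant_id,
        "tool_name": tool_name,
        "request_id": request_id,
        "idempotency_key": idempotency_key,
        "trace_id": trace_id,
        "input_summary": redact(request, sensitive_fields),
        "output_summary": redact(response, sensitive_fields),
        "policy_decision": decision,
        "rule_triggered": rule,
    }
    return json.dumps(record, sort_keys=True)
```

Emitting one JSON object per invocation keeps the log queryable by any of the listed identifiers, and passing the same trace_id to downstream calls lets agent and tool-service logs be joined later.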

9) Rate Limits & Budgeting
- Rate limiting unit (per user, per org, or per token) and time window:
- Behavior on limit:
 - return RATE_LIMITED with retry_after guidance
- Cost/budget controls (if model calls are involved):
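
The limit behavior above can be sketched with a token bucket that reports how long the caller should wait. This is one common technique, swapped in here as an example; the class name, the response keys, and the injectable clock are assumptions.

```python
import time


class TokenBucket:
    """Per-unit token bucket. unit_key identifies the rate limiting unit,
    for example a user id, an org id, or an API token."""

    def __init__(self, capacity: int, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock          # injectable so tests can control time
        self._buckets = {}           # unit_key -> (tokens, last_seen_time)

    def check(self, unit_key: str) -> dict:
        now = self._clock()
        tokens, last = self._buckets.get(unit_key, (float(self.capacity), now))
        # Refill according to elapsed time, never above capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.refill_per_second)
        if tokens >= 1.0:
            self._buckets[unit_key] = (tokens - 1.0, now)
            return {"allowed": True}
        self._buckets[unit_key] = (tokens, now)
        # Time until one full token is available again.
        retry_after = (1.0 - tokens) / self.refill_per_second
        return {"allowed": False, "code": "RATE_LIMITED", "retry_after_seconds": retry_after}
```

A usage example: TokenBucket(capacity=2, refill_per_second=1.0).check("org_42") allows two immediate calls for that org, then returns RATE_LIMITED with a retry_after_seconds hint until the bucket refills.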

10) Test Plan (Must-have cases)
- Valid request happy path
- Invalid input per required field
- Unauthorized scope
- Cross-tenant access attempt
- Retry after transient failure
- Duplicate request with same idempotency key
- High-risk action triggers approval gate

11) Rollout Plan
- Internal-only mode
- Limited tenant allowlist
- Monitoring dashboards and alerts
- Kill switch / feature flag
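
The rollout stages above could be gated by a single check evaluated on every request, as in this sketch. The RolloutConfig fields and the precedence order (kill switch first, then internal tenants, then the allowlist) are assumptions; in practice the values would typically come from a feature-flag service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutConfig:
    kill_switch: bool = False                    # True disables the tool for everyone
    internal_only: bool = True                   # True limits the tool to internal tenants
    internal_tenants: frozenset = frozenset()
    tenant_allowlist: frozenset = frozenset()


def tool_enabled(cfg: RolloutConfig, tenant_id: str) -> bool:
    """Decide whether the tool may run for this tenant."""
    if cfg.kill_switch:
        return False                             # kill switch overrides everything
    if tenant_id in cfg.internal_tenants:
        return True
    if cfg.internal_only:
        return False
    return tenant_id in cfg.tenant_allowlist
```

Checking the kill switch before anything else means a single flag flip stops all invocations, including internal ones.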

If you can’t fill out sections 4–8 clearly, the tool isn’t ready. Narrow the capability’s scope until you can.