Interface Health Review (IHR) Template

Use this in a 60–90 minute working session for one critical user journey (e.g., login, checkout, deploy, billing).
Goal: verify the system can change without surprise.

1) Scope
- Journey name:
- Entry point (UI/API/event):
- Systems touched (services, queues, DBs, third parties):
- Single accountable owner (name/team):

2) Contract Map (Interfaces)
- List each service-to-service API or event contract involved.
- For each: owner, versioning approach (if any), and how breaking changes are prevented.
- Identify the “most fragile” contract (the one that causes the most churn).

3) Observability Reality Check
- Do we have traces/logs/metrics tied to this journey end-to-end?
- What is the primary SLI for success (e.g., success rate, latency, reconciliation accuracy)?
- Where are alerts defined, and who receives them?
- What’s the blind spot (what could fail silently)?

4) Release and Reversibility
- How is this journey deployed (pipeline path, approvals, gates)?
- Rollback method: revert, feature flag off, config change, traffic shift?
- Fastest safe rollback path documented? (link)
- What’s the realistic blast radius of a bad deploy?

5) Security and Data Handling
- Authn/authz boundary: where is it enforced?
- Input validation: where is it enforced?
- Secrets: confirm none are stored in repo; identify secret manager used.
- Third-party tools: what data is permitted to be pasted into AI tools?

6) Decision Durability
- Link to ADRs/design notes affecting this journey.
- If missing: write a short ADR now (problem, decision, tradeoffs, alternatives rejected).
- Identify one repeated debate you can eliminate with a decision record.

7) Action Items (must be concrete)
- 1–3 fixes to improve interface health (owner + due date):
  a)
  b)
  c)

8) Stop Doing List
- One behavior to stop immediately (e.g., merging large mixed diffs, shipping without telemetry, bypassing review gates).

Rule: If you can’t name owners, rollback paths, and primary SLIs for the journey, don’t celebrate “shipping faster.”
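
If the session's answers are captured in a structured form, the rule can be checked mechanically. The following is a minimal sketch, not part of the template itself: the `JourneyReview` record, its field names, the `missing_basics` helper, and the sample values are all hypothetical and only illustrate the idea of flagging a journey whose owner, rollback path, or primary SLI is still blank.

```python
from dataclasses import dataclass


@dataclass
class JourneyReview:
    """Hypothetical record of the answers captured in one IHR session."""
    journey: str
    owner: str = ""          # section 1: single accountable owner
    rollback_path: str = ""  # section 4: fastest safe rollback path
    primary_sli: str = ""    # section 3: primary SLI for success


def missing_basics(review: JourneyReview) -> list[str]:
    """Return the names of the three required answers that are still blank."""
    required = {
        "owner": review.owner,
        "rollback path": review.rollback_path,
        "primary SLI": review.primary_sli,
    }
    return [name for name, value in required.items() if not value.strip()]


# Example values are assumptions for illustration only.
checkout = JourneyReview(
    journey="checkout",
    owner="payments-team",
    primary_sli="checkout success rate",
)

gaps = missing_basics(checkout)
if gaps:
    print(f"{checkout.journey}: not ready, missing {', '.join(gaps)}")
else:
    print(f"{checkout.journey}: owner, rollback path, and primary SLI are named")
```

A check like this could run at the end of the session or as a lightweight gate on the review document; it only confirms that the answers exist, not that they are good.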