AGENTIC PRODUCT MANAGEMENT PLAYBOOK

Purpose
Use this playbook to roll out AI agents that (1) continuously research customer and market signal, (2) generate PRD/spec drafts, and (3) act as copilots across meetings and execution, without creating security, trust, or workflow chaos.

1) Choose the first workflow (pick ONE)
- Weekly Voice-of-Customer (VOC) brief
- Competitive monitoring + pricing/release-note digest
- PRD “spec compile” (draft + citations)
- Meeting-to-actions copilot (decisions, owners, deadlines)
Selection criteria: high frequency, repeatable structure, a clear success metric, accessible data sources.

2) Standardize your inputs (avoid garbage-in)
- Define a single taxonomy for: customer segments, churn reasons, request categories, severity levels.
- Identify 3–5 canonical sources of truth, for example:
  * Product analytics: Amplitude/Mixpanel dashboards
  * Support: Zendesk/Intercom tags
  * CRM: Salesforce fields (ARR, segment, stage)
  * Work tracking: Jira/Linear epics and labels
  * Knowledge base: Confluence/Notion PRDs + policies
- Create a one-page “definition sheet”: KPI definitions, event names, time windows, and segment rules.

3) Set agent output requirements (non-negotiables)
- Every factual claim must include a citation (a link to a dashboard, ticket, doc, or transcript).
- Include a “Known Unknowns” section listing what data is missing.
- Separate Observations (facts) from Recommendations (judgment).
- Use your company’s PRD template (Problem, Goals, Non-goals, User stories, Acceptance criteria, Risks).

4) Permissions + governance guardrails
- Start read-only. Draft into a sandbox (a Notion/Confluence draft space).
- Restrict tool access by default (least privilege):
  * Allow: search/read, summarization, draft creation
  * Block: sending external emails, deleting tickets, changing production configs
- Require human approval for any publish/send action.
- Turn on audit logging where possible (who ran the agent, what it accessed, what it wrote).
- Block PII where feasible (redaction rules; do-not-ingest sources such as HR docs).

5) Evaluation (treat it like production software)
Create a small “golden set” (30–50 historical cases) and score the agent weekly on:
- Citation coverage rate (% of claims linked to sources)
- Precision of requirement extraction (did it capture the real need?)
- Spec completeness (did it fill all required sections?)
- Stakeholder rating (1–5 from Eng/Design/Support)
Define a rollback rule: if scores drop more than 10% after a model or prompt change, revert.

6) Rollout plan (60 days)
Weeks 1–2: Grounding + taxonomy + sandbox drafts
Weeks 3–4: Ship the first agent output on a fixed cadence (weekly VOC brief or PRD drafts)
Weeks 5–6: Add limited write permissions (create Jira tickets, assign owners)
Weeks 7–8: Expand to a second workflow + formalize evals + governance

7) ROI tracking (simple scoreboard)
Track before/after metrics:
- Cycle time: idea → ready-for-eng (target: -20% to -30%)
- Hours saved per PM per week (target: 2–5 hours)
- Rework rate (bugs/defects traced to unclear requirements)
- Meeting load (number of alignment meetings per initiative)

PRD COMPILER TEMPLATE (copy/paste)
1) Problem (with citations)
2) Goals (measurable KPI target + timeframe)
3) Non-goals
4) User stories (persona + job-to-be-done)
5) Acceptance criteria (testable)
6) Risks + mitigations
7) Open questions + owners
8) Evidence appendix (tickets, quotes, charts)

If you implement only one rule: “No citation, no trust.” That single constraint turns an LLM from a fluent guesser into an accountable product teammate.
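To make the “no citation, no trust” rule and the section-5 evaluation loop concrete, here is a minimal Python sketch of the two mechanical checks: citation coverage and the greater-than-10% rollback rule. The claim format, example URLs, and regex-based link detection are illustrative assumptions, not a prescribed implementation; a real pipeline would extract claims and their sources from your drafting tool’s API.

```python
import re

# Assumption: a "claim" is one bullet line from the agent's draft, and a
# citation is any URL embedded in that line. Real pipelines would use a
# structured claim/source model instead of regex matching.
URL_PATTERN = re.compile(r"https?://\S+")

def citation_coverage(claims: list[str]) -> float:
    """Fraction of claims that carry at least one source link."""
    if not claims:
        return 0.0
    cited = sum(1 for claim in claims if URL_PATTERN.search(claim))
    return cited / len(claims)

def should_rollback(baseline: float, current: float, threshold: float = 0.10) -> bool:
    """Section-5 rollback rule: revert if a score drops more than 10%
    (relative to the golden-set baseline) after a model/prompt change."""
    if baseline <= 0:
        return False
    return (baseline - current) / baseline > threshold

# Example weekly run (URLs and claims are hypothetical):
draft_claims = [
    "SMB churn rose 4% QoQ (https://analytics.example.com/dash/123)",
    "Enterprise users request SSO weekly (https://support.example.com/tag/sso)",
    "Competitor X shipped usage-based pricing",  # uncited -> drags coverage down
]
coverage = citation_coverage(draft_claims)   # 2 of 3 claims cited
print(f"citation coverage: {coverage:.0%}")  # prints "citation coverage: 67%"
print("rollback:", should_rollback(baseline=0.90, current=coverage))  # prints "rollback: True"
```

Using a relative (not absolute) drop keeps the rule meaningful across metrics with different baselines; the same `should_rollback` check can be applied to spec completeness or stakeholder ratings.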