Translate A/B test results into a plain-English decision memo

Data & Analysis: ab-testing, experimentation, decision-making

The statistical output of an A/B test rarely translates directly into a shipping decision. This prompt converts raw test results into a clear recommendation that covers confidence level, practical significance, and the relevant caveats.

Prompt

You are an experimentation analyst. I need help turning raw A/B test results into a decision memo for a non-technical stakeholder.

Test details:
- What was tested: {{TEST_DESCRIPTION}}
- Primary metric and goal: {{PRIMARY_METRIC}}
- Secondary metrics tracked: {{SECONDARY_METRICS}}
- Results (paste numbers, including sample sizes, conversion rates, p-value or confidence interval): {{TEST_RESULTS}}
- Test duration and traffic split: {{TEST_DURATION_AND_SPLIT}}

Follow these steps:
1. State in one sentence whether the test reached statistical significance and at what confidence level.
2. Calculate or confirm the relative and absolute lift on the primary metric. Flag if the lift is statistically significant but practically small (e.g., an absolute change of less than 0.5 percentage points).
3. Check the secondary metrics: did any move in a direction that would offset the primary metric gain? Flag any guardrail metric violations.
4. Identify at least two reasons to be cautious about generalizing this result (e.g., novelty effect, limited segment coverage, short test window).
5. Write a recommendation paragraph: ship, iterate, or do not ship — and state the single most important reason.
6. List two follow-up questions the stakeholder is likely to ask and pre-answer them.

This prompt assumes you have already run the statistical test elsewhere. It does not perform significance calculations from scratch.

Variables to fill in

{{TEST_DESCRIPTION}}
{{PRIMARY_METRIC}}
{{SECONDARY_METRICS}}
{{TEST_RESULTS}}
{{TEST_DURATION_AND_SPLIT}}
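
Because the prompt expects the significance test to have been run already, the numbers pasted into {{TEST_RESULTS}} have to come from somewhere. The sketch below is one way to produce them for a conversion-rate metric, assuming a two-sided two-proportion z-test is appropriate for your data. The function name `summarize_ab_test`, the 0.5-percentage-point threshold (taken from the example in step 2), and the sample counts are illustrative assumptions, not part of the original prompt.

```python
import math

def summarize_ab_test(control_conversions, control_n,
                      variant_conversions, variant_n,
                      min_absolute_lift=0.005):
    """Two-sided two-proportion z-test plus absolute and relative lift.

    min_absolute_lift is a hypothetical practical-significance threshold
    (0.005 = 0.5 percentage points); adjust it to your own context.
    """
    p_control = control_conversions / control_n
    p_variant = variant_conversions / variant_n

    # Absolute lift is a difference of proportions (multiply by 100 for
    # percentage points); relative lift is that difference over the baseline.
    absolute_lift = p_variant - p_control
    relative_lift = absolute_lift / p_control

    # Pooled standard error under the null hypothesis of equal rates.
    p_pooled = (control_conversions + variant_conversions) / (control_n + variant_n)
    se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / control_n + 1 / variant_n))
    z = absolute_lift / se_pooled

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Approximate 95% confidence interval on the absolute lift (unpooled SE).
    se_unpooled = math.sqrt(p_control * (1 - p_control) / control_n
                            + p_variant * (1 - p_variant) / variant_n)
    ci_95 = (absolute_lift - 1.96 * se_unpooled,
             absolute_lift + 1.96 * se_unpooled)

    return {
        "control_rate": p_control,
        "variant_rate": p_variant,
        "absolute_lift": absolute_lift,
        "relative_lift": relative_lift,
        "z": z,
        "p_value": p_value,
        "ci_95": ci_95,
        "practically_small": abs(absolute_lift) < min_absolute_lift,
    }

# Made-up example counts: 500/10,000 control vs. 560/10,000 variant.
summary = summarize_ab_test(500, 10_000, 560, 10_000)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The printed summary can be pasted into {{TEST_RESULTS}} as is. For metrics that are not simple conversion rates (revenue per user, session length), a different test would be needed.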

How to use this prompt

Copy the prompt above (Copy button on the top-right).
Replace each {{VAR}} with your own value. Variables: {{TEST_DESCRIPTION}}, {{PRIMARY_METRIC}}, {{SECONDARY_METRICS}}, {{TEST_RESULTS}}, {{TEST_DURATION_AND_SPLIT}}.
Paste it into one of the recommended tools below.
Iterate: tighten constraints in the prompt if the output is generic.
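
If you reuse the prompt often, the replace step can be scripted. The following is a minimal sketch, assuming the prompt is stored as a string with the same {{NAME}} placeholders; the helper name `fill_prompt` and the example values are hypothetical. It refuses to produce output while any placeholder is still unfilled, which guards against pasting a half-completed prompt.

```python
import re

# Matches placeholders of the form {{UPPER_SNAKE_CASE}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")

def fill_prompt(template, values):
    """Replace each {{NAME}} placeholder; raise if any value is missing."""
    missing = sorted({name for name in PLACEHOLDER.findall(template)
                      if name not in values})
    if missing:
        raise KeyError(f"No value supplied for: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)

# Shortened two-line excerpt of the prompt, with made-up values.
template = (
    "- What was tested: {{TEST_DESCRIPTION}}\n"
    "- Primary metric and goal: {{PRIMARY_METRIC}}"
)
filled = fill_prompt(template, {
    "TEST_DESCRIPTION": "New checkout button copy",
    "PRIMARY_METRIC": "Checkout conversion rate, increase",
})
print(filled)
```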

Why this prompt is structured this way

The prompt is split into explicit steps because LLMs do better when the path is named, not implied. Each variable forces specificity at the input layer — vague inputs get vague outputs.

Heads up: some of the links on this page are affiliate links — meaning we may earn a commission if you sign up after clicking, at no extra cost to you. We only recommend tools we'd put on our own stack. You can see our full affiliate disclosure here.

Pair this prompt with a tool

Notion AI

$8/user/mo add-on

AI baked into the docs/wiki/projects tool you already use.

Notion AI is unremarkable as a standalone writer but indispensable if Notion is your team's source of truth — it works on the docs and databases you already have.

productivity, writing

Claude (Anthropic)

$0/mo (Pro at $20)

Frontier model with long context and strong reasoning.

Claude (Opus / Sonnet / Haiku tiers) is the assistant favored by writers and engineers who care about reasoning quality and tone. 1M token context on Opus.

writing, coding, learning

ChatGPT (OpenAI)

$0/mo (Plus at $20)

The category-defining general-purpose AI assistant.

ChatGPT has the broadest feature surface: image gen, voice, custom GPTs, web browsing, code execution. Often the right default; sometimes beaten on specific tasks by Claude or Perplexity.

writing, coding, learning

Perplexity

$0/mo (Pro at $20)

AI search engine with citations.

Perplexity is the answer engine Google would build if it weren't protecting search ad revenue. Cited answers, follow-up questions, focused source modes.

learning, data

