
Ab Test Stats
Delivers precise A/B test analysis through sample size calculators, statistical significance tests, and hypothesis testing that outputs exact p-values and confidence intervals. Data analysts, marketers, and product managers use it to evaluate experiment results from website variants, app features, or email campaigns, ensuring decisions rely on statistically valid data.
Overview
The Ab Test Stats MCP server computes statistical metrics for A/B testing, including sample size requirements, significance testing, and hypothesis evaluation with precise p-values and confidence intervals. It processes conversion rates, metrics like click-through rates, or any quantifiable outcomes from split tests.
Key Capabilities
- sample_size_calculator: Calculates minimum sample sizes needed to detect specific effect sizes at given power levels (e.g., 80%) and significance thresholds (e.g., alpha=0.05), accounting for baseline rates and minimum detectable effects.
- significance_test: Runs chi-squared or z-tests on A/B test data to determine if differences between variants are statistically significant, returning p-values and confidence intervals.
- hypothesis_testing: Evaluates null hypotheses (e.g., no difference between variants) using Fisher's exact test for small samples or t-tests for continuous metrics, and reports effect sizes such as Cohen's d.
These functions accept inputs such as variant sample sizes, success counts, or means and standard deviations, and return interpretable results for immediate decision-making.
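To make the sample-size capability concrete, here is a minimal, stdlib-only sketch of the standard normal-approximation formula for a two-proportion test, the kind of computation sample_size_calculator performs. The function name, parameter names, and defaults below are illustrative assumptions, not the server's actual API.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, min_detectable_effect,
                            alpha=0.05, power=0.80):
    """Minimum per-variant sample size for a two-proportion z-test.

    min_detectable_effect is absolute (e.g. 0.02 for a 2-point lift).
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # power quantile
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Detect a lift from a 10% baseline to 12%, alpha=0.05, power=0.80:
n = sample_size_per_variant(0.10, 0.02)  # about 3,841 users per variant
```

As the formula suggests, halving the minimum detectable effect roughly quadruples the required sample size, which is why planning with sample_size_calculator before launching an experiment matters.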
Use Cases
- Landing Page Optimization: A marketer inputs visitor counts and conversion rates from control and variant pages into significance_test to confirm whether a new headline lifts conversions by 10% with 95% confidence.
- Feature Rollout Validation: Product managers use sample_size_calculator to plan user cohorts for testing a new app button, then apply hypothesis_testing on engagement metrics post-experiment.
- Email Campaign Analysis: Growth teams feed open/click data from A/B subject lines into significance_test to identify winners before scaling sends.
- Pricing Test Evaluation: E-commerce analysts test price variants with hypothesis_testing on revenue per user, checking p-values against business thresholds.
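The significance_test calls in the use cases above boil down to a two-proportion z-test on conversion counts. The sketch below shows that computation with the stdlib only; the function name and return shape are illustrative assumptions, not the server's actual interface.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(successes_a, n_a, successes_b, n_b,
                          confidence=0.95):
    """Two-sided z-test for the difference between two conversion rates.

    Returns (p_value, (ci_low, ci_high)), where the confidence interval
    covers the absolute difference p_b - p_a.
    """
    p_a, p_b = successes_a / n_a, successes_b / n_b
    # Pooled rate for the test statistic's standard error
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_b - p_a
    return p_value, (diff - z_crit * se, diff + z_crit * se)

# Control converted 200/4000 (5.0%); variant converted 260/4000 (6.5%)
p, ci = two_proportion_z_test(200, 4000, 260, 4000)
```

For these hypothetical counts the p-value falls below 0.05 and the interval excludes zero, so the variant's lift would be declared significant; with smaller counts, a team would reach instead for the Fisher's exact path in hypothesis_testing.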
Who This Is For
Data analysts running frequent experiments, marketing teams optimizing campaigns, product managers validating features, and growth hackers who need quick, reliable statistics without a full statistics package. Requires basic familiarity with A/B testing concepts such as lift and power analysis.