During our predictive modeling office hours at work, someone mentioned CUPED (pronounced "cue-ped", short for Controlled-experiment Using Pre-Experiment Data) in response to a question about high variance in A/B tests with small group sizes (side note: sharing knowledge at work is highly encouraged). CUPED is a fairly simple method for reducing that variance. From the original authors:
“[CUPED] utilizes data from the pre-experiment period to reduce metric variability and hence achieve better sensitivity.”
Deng, Xu, Kohavi, and Walker
Marton (Bytepawn) breaks down the CUPED formula quite nicely:
Assume an A/B testing setup where we’re measuring a metric M, e.g. $ spend per user. We have N users, randomly split into A and B. A is control, B is treatment. We have metric M for each user for the “before” time period, when treatment and control were the same, and the “after” period, when treatment had some treatment applied, which we hope increased their spend.
Let Y_i be the i-th user’s spend in the “after” period, and X_i be their spend in the “before” period, both for A and B combined. We compute an adjusted “after” spend Y′_i.
The CUPED recipe:
1. Compute the covariance cov(X,Y) of X and Y.
2. Compute the variance var(X) of X.
3. Compute the mean μ_X of X.
4. Compute the adjusted Y′_i = Y_i − (X_i − μ_X) · cov(X,Y) / var(X) for each user.
5. Evaluate the A/B test using Y′ instead of Y.
Marton Trencseni, Bytepawn
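To make the recipe concrete, here is a minimal sketch in plain Python. The function names `cuped_adjust` and `lift`, and the simulated spend data, are my own assumptions for illustration and do not come from the paper or the Bytepawn post. As in the recipe, the covariance, variance, and mean are computed over A and B combined.

```python
import random
from statistics import fmean, variance


def cuped_adjust(x, y):
    """Return the CUPED-adjusted values Y' for paired before (x) / after (y) data."""
    mean_x, mean_y = fmean(x), fmean(y)
    # Step 1: sample covariance of X and Y (n - 1 denominator, matching variance()).
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (len(x) - 1)
    # Step 2: sample variance of X.
    var_x = variance(x)
    theta = cov_xy / var_x
    # Steps 3-4: subtract the part of Y that is explained by X's deviation from its mean.
    return [yi - (xi - mean_x) * theta for xi, yi in zip(x, y)]


# Simulated data (an assumption for this sketch): "after" spend tracks "before"
# spend, and treated users get a small extra lift of 1.0.
rng = random.Random(0)
n = 2000
before = [rng.gammavariate(2.0, 10.0) for _ in range(n)]
in_treatment = [rng.random() < 0.5 for _ in range(n)]
after = [
    0.8 * x + rng.gauss(0, 5) + (1.0 if treated else 0.0)
    for x, treated in zip(before, in_treatment)
]

adjusted = cuped_adjust(before, after)


def lift(values):
    """Difference in mean spend between treatment (B) and control (A)."""
    treated = [v for v, t in zip(values, in_treatment) if t]
    control = [v for v, t in zip(values, in_treatment) if not t]
    return fmean(treated) - fmean(control)


# Step 5: evaluate the test on Y' instead of Y.
print("variance of Y :", variance(after))
print("variance of Y':", variance(adjusted))
print("lift using Y  :", lift(after))
print("lift using Y' :", lift(adjusted))
```

The adjustment leaves the overall mean of the metric unchanged, since the (X_i − μ_X) terms sum to zero, so any drop in variance comes purely from removing the part of each user's "after" spend that their "before" spend already predicted. In a real analysis you would feed Y′ into whatever significance test you already use for Y.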
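This is a great way to increase statistical power with smaller groups and/or shorter timeframes. The adjustment removes the part of Y that is predictable from X, so the variance of Y′ is var(Y) · (1 − ρ²), where ρ is the correlation between X and Y. The more stable users' spend is from the “before” period to the “after” period, the bigger the gain.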