The A/B Test Trap

With the rise of the Lean Startup, everybody and their granny has bought into A/B testing. That landing page for your yet-to-be-built startup idea? You can A/B test it at Unbounce. Got a site you want to A/B test? Try Optimizely. Are you a big company that wants to run lots of tests? Maybe Test & Target is your ticket.

What each of these solutions offers is the promise that anyone can be Amazon, Facebook, or Google in their approach to site optimization. As with most things though, the devil is in the details. How long should I run the test to ensure adequate statistical power? Most statisticians will inevitably say, “That depends!” It depends on the level of statistical significance we hold ourselves to, on the size of the effect on the target variable (e.g., click-through or sales), and on the sample size, among other things.
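
To make the “that depends” a bit more concrete, here is a minimal sketch of the usual normal-approximation sample size calculation for comparing two conversion rates. The function names, the 10% and 12% rates, and the traffic figure are all made up for illustration, and the defaults (5% two-sided significance, 80% power) are common conventions rather than anything the tools above prescribe.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH arm of a two-sided,
    two-proportion z-test (normal approximation, equal-sized arms)."""
    if p_control == p_variant:
        raise ValueError("The two rates must differ; a zero effect needs infinite data.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_variant) / 2            # pooled rate under the null
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return math.ceil(numerator / (p_control - p_variant) ** 2)


def days_to_run(n_per_arm, daily_visitors, arms=2):
    """Days needed if daily_visitors are split evenly across the arms."""
    return math.ceil(n_per_arm * arms / daily_visitors)


if __name__ == "__main__":
    # Hypothetical numbers: 10% baseline, hoping to detect a lift to 12%.
    n = sample_size_per_arm(0.10, 0.12)
    print(n, "visitors per arm,", days_to_run(n, daily_visitors=1000), "days at 1,000 visitors/day")
```

Halving the lift you want to detect roughly quadruples the sample you need, which is why small effects on low-traffic pages can take a very long time to confirm.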

The point is that A/B testing done right involves a few more considerations beyond randomly swapping pages and observing differences. The same is true of in-store testing for brick-and-mortar retailers. Got new signage you want to test, a new merchandising strategy, or a new kiosk? It’s not enough to just look at a simple pre-post comparison of the featured items. Most big retailers of course know this, but it doesn’t hurt to be reminded of the importance of setting up a real experimental design prior to a test. Otherwise, you’re most likely being deluded by randomness…

With that in mind, there are a couple of books I’ve been reading recently that have proven useful. The first, called Testing 1 - 2 - 3, is published by Stanford University Press. This book focuses on marketing and service-oriented experiments (as opposed to manufacturing) and offers a good overview of full and fractional factorial design. There are lots of examples, as well as a review section at the end of each chapter with exercises that readers can use to test their comprehension.

The other book is called Optimal Design of Experiments: A Case Study Approach, published by Wiley. Once you get past the somewhat off-putting approach of the authors narrating case studies via a he-said, she-said dialog, there’s a lot of good content in here. Whereas Testing 1 - 2 - 3 spends a bit more time in the introductory chapters explaining the math, this book dives immediately into examples and focuses on practical explanations.

Whichever book you choose, if you’re doing A/B tests at your company, there’s much to be gained by spending some time brushing up on the principles of experimental design. Aside from giving you confidence in the results you see, multivariate factorial design should let you see interactions between variables and, perhaps more importantly, ultimately help you move through your tests faster…
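
As a rough illustration of what “seeing interactions” means, here is a small sketch of a 2x2 full factorial test, using the textbook definitions of main effects and interaction. The factor names and conversion rates are invented for the example, and the helper functions are hypothetical, not taken from either book.

```python
from itertools import product


def full_factorial(factors):
    """List every combination of factor levels (the full factorial design)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*(factors[n] for n in names))]


def effects_2x2(rates):
    """Main effects and interaction for two factors with levels 0 and 1.

    `rates` maps (a_level, b_level) to the observed conversion rate in that cell.
    """
    main_a = ((rates[1, 0] + rates[1, 1]) - (rates[0, 0] + rates[0, 1])) / 2
    main_b = ((rates[0, 1] + rates[1, 1]) - (rates[0, 0] + rates[1, 0])) / 2
    # Interaction: how much the effect of A changes when B is switched on.
    interaction = ((rates[1, 1] - rates[0, 1]) - (rates[1, 0] - rates[0, 0])) / 2
    return {"A": main_a, "B": main_b, "AxB": interaction}


if __name__ == "__main__":
    # Hypothetical factors: A = new headline, B = new call-to-action button.
    design = full_factorial({"headline": [0, 1], "button": [0, 1]})
    print(len(design), "cells:", design)

    # Made-up conversion rates per cell, keyed by (headline, button).
    observed = {(0, 0): 0.10, (1, 0): 0.12, (0, 1): 0.11, (1, 1): 0.16}
    print(effects_2x2(observed))
```

Two separate A/B tests run against the original page would only observe the (1, 0) and (0, 1) cells, so they could not tell you that the two changes together do better than the sum of their individual lifts. One factorial test covers all four cells at once. Whether an estimated interaction is real or just noise still needs a proper significance test, which this sketch leaves out.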
