The Misleading A/B Test
I have many addictions, including one that's very common among entrepreneurs: data analytics.
We run a lot of A/B tests at Sampa, and last Friday we wrapped up test number 902 and are ready to start 903 (903 stands for March 2009, not nine hundred tests). 902 was about when to show our customers the premium packages: before they sign up, just after they sign up, or when they hit a limit on their plan. We also tested a second dimension: one variant showing a single upgrade option, the other showing two.
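For a rough picture of what a two-dimensional test like that looks like in code, here's a minimal sketch of deterministic variant assignment. Everything here is invented for illustration (the variant names, the hashing scheme); it's not our actual implementation.

```python
import hashlib

# Hypothetical dimensions of a test like 902:
TIMINGS = ["before_signup", "after_signup", "at_plan_limit"]
UPGRADE_COUNTS = [1, 2]  # show one upgrade option vs. two

def variant_for(user_id: str) -> tuple[str, int]:
    """Assign a user to one of the 3 x 2 = 6 variants.

    Hashing the user id keeps the assignment stable, so the same
    user sees the same combination on every visit.
    """
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
    timing = TIMINGS[bucket % len(TIMINGS)]
    count = UPGRADE_COUNTS[(bucket // len(TIMINGS)) % len(UPGRADE_COUNTS)]
    return timing, count

print(variant_for("user-42"))  # same pair every run for this id
```

The point of crossing the two dimensions is that you can measure each one independently, which is exactly what the MSN story below failed to do.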
But this post is about a common mistake I saw at MSN when we released new versions of the product. As soon as we shipped a version with a dramatically different design from the previous one, the data points we collected would swing, sometimes up, sometimes down.
The problem was that too much had changed at once. A new version that brought a 7% gain in page views was highly celebrated, but there were 12 changes that shipped at once. Couldn't some of them have brought a 15% gain while others took away 8%? We'll never know, because they all went out together.
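To make the arithmetic concrete (these per-change numbers are invented for this sketch, not from any real release), here's how a dozen effects can net out to a modest gain while hiding a big winner and a big loser:

```python
# Hypothetical per-change effects on page views, as multipliers.
# One change is worth +15%, another costs -8%, the rest are small.
effects = [1.15, 0.92, 1.03, 0.98, 1.01, 0.99,
           1.02, 0.97, 1.04, 0.99, 1.00, 0.98]

combined = 1.0
for e in effects:
    combined *= e

print(f"net change: {(combined - 1) * 100:+.1f}%")  # net change: +6.6%
```

The aggregate number is real, but it tells you nothing about which of the twelve changes earned it.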
You might say a gain is a gain, and that's a good thing. I agree. It only becomes a problem when people attribute the 7% gain to one of the 12 changes and use that argument to push more changes forward. If you changed so much on your homepage at once, how do you know which changes are really responsible for a gain in page views (PV), unique users (UU), conversion, upgrades, etc.?
By far, the most common mistake I see people making is changing their product and launching a new homepage on the same day. If you do that, you really won't know how much your new user experience has impacted your business or how much your new branding has affected conversion and expectations.
You shouldn't evaluate the individual pieces of your product without considering the whole, but you can't attribute a gain or loss to one of many changes if you shipped them all at once.