Nearly every company with an internet connection runs analytical experiments. Go into any e-commerce company or tech platform and you’ll start hearing about product elements that have been tested: interfaces, emails, creatives, operational processes, pricing strategies, machine learning algorithms, the list goes on. Testing and experimentation are core features of most businesses today. The impulse to learn through experimentation is at the heart of the scientific method, and testing continues to be an excellent way to grow products and businesses.
But testing is not without faults, even when it is done in good faith. At Tozan, we’ve lived and breathed testing and intimately understand its strengths and weaknesses. That’s why we built Tozan to be a superior experimentation platform. Traditional A|B testing has three core challenges: 1) Measurable Waste 2) Arbitrary Versions 3) Moving Target. These challenges are interrelated, common across companies and industries, and the central reasons why we built Tozan. Below we’ll briefly discuss each one, and we’ll cover them in more detail over the next several posts.
1. Measurable Waste
The first core challenge with traditional A|B testing is that it creates waste, and that waste can be measured. If you test two versions of your product side by side at a 50:50 split and version 2 outperforms version 1 by 10%, then half of your users saw a version worth about 10% less, so you gave up roughly 5% of the value you would have captured had everyone seen version 2. Even if only a relatively small cohort is exposed to the challengers, or new versions, that waste accumulates for as long as the test runs, and in a competitive sector you are signing away value unnecessarily.
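To make the waste arithmetic concrete, here is a minimal sketch. It assumes each version has a known per-user value (in a real test you only learn this afterwards), and the function name `wasted_fraction` and all numbers are illustrative, not part of Tozan. With version 1 as the 1.00 baseline and version 2 at 1.10, the 50:50 split loses about 4.5% of the attainable value, in line with the rough 5% figure above.

```python
def wasted_fraction(values, shares):
    """Fraction of attainable value lost by splitting traffic across versions.

    values: per-user value of each version (e.g. conversion rate or revenue)
    shares: fraction of traffic sent to each version; must sum to 1
    """
    if len(values) != len(shares):
        raise ValueError("need one traffic share per version")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    best = max(values)  # value if everyone saw the best version
    realized = sum(v * s for v, s in zip(values, shares))
    return (best - realized) / best


# Hypothetical: version 2 is worth 10% more per user, traffic split 50:50.
even_split = wasted_fraction([1.00, 1.10], [0.5, 0.5])

# Hypothetical: a challenger that is 10% worse is shown to only 10% of users.
small_exposure = wasted_fraction([1.00, 0.90], [0.9, 0.1])

print(f"50:50 split waste:    {even_split:.1%}")
print(f"90:10 exposure waste: {small_exposure:.1%}")
```

The second case shows why a small test cohort reduces the waste but does not eliminate it: the loss is still there on every day the test runs.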
2. Arbitrary Versions
The second core challenge concerns which versions get tested in the first place. Most companies assign teams the task of designing new versions, or challengers. After some period of ideation and product development, the challengers square off against the mainstay version, or control. But the landscape of potential product versions is far too vast to be meaningfully explored in a traditional A|B test. In reality, you wind up comparing a few arbitrary points from a much larger menu of potential variants. For example, if the team wants to test a pink background against the current blue one, they have to omit the red, orange, green, and purple versions, any or all of which might actually produce more value. The test finds the better of two points, which is a local maximum at best, when what the business wants is the global maximum. The arbitrary version problem often leads to a cycle of more testing, which creates more Measurable Waste (problem 1). It also feeds into the third core challenge, which is that products and businesses are moving targets.
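The gap between the winner of a two-way test and the best available variant can be sketched in a few lines. The per-color values below are made-up numbers for illustration only; in practice they are unknown, which is exactly the problem.

```python
# Hypothetical value per user for each background color (made-up numbers).
value_by_color = {
    "blue": 1.00,    # current control
    "pink": 1.04,    # the challenger the team chose to build
    "red": 0.97,
    "orange": 1.01,
    "green": 1.12,
    "purple": 1.06,
}

# A traditional A|B test only compares the versions that were built.
tested = ["blue", "pink"]
ab_winner = max(tested, key=value_by_color.get)

# The best option over the whole menu may never have been tested.
global_best = max(value_by_color, key=value_by_color.get)
missed_value = value_by_color[global_best] - value_by_color[ab_winner]

print(f"A|B winner: {ab_winner}, best overall: {global_best}")
print(f"Value left on the table per user: {missed_value:.2f}")
```

Here the test correctly picks pink over blue, yet the untested green version would have been worth more than either.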
3. Moving Target
The third core challenge with traditional A|B testing is that it assumes your business and product exist in a static environment. That’s not the case. Markets evolve continuously, so it is suboptimal to impose a single outcome from an A|B test onto a changing background. For example, suppose you run a test between two versions for one month and find that version 1 performed better against the KPI. If recent user acquisition efforts have since brought in a new demographic of users, you cannot be certain that version 1, which was gauged on the previous demographic composition, will resonate with the new cohorts. Seasonality is also a factor here. As conditions change in the market and the world, previous test conclusions can become irrelevant. In a future post we’ll discuss how Tozan views experimentation as an always-on optimization problem, not as a discrete point-in-time activity.
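A small sketch shows how a test result can flip when the user mix changes. The segment names, conversion rates, and traffic mixes are all hypothetical; the only point is that the blended KPI depends on who is in the audience.

```python
def blended_rate(segment_rates, segment_mix):
    """Overall conversion rate given per-segment rates and the traffic mix."""
    return sum(segment_rates[seg] * share for seg, share in segment_mix.items())


# Hypothetical per-segment conversion rates for each version.
rates = {
    "version_1": {"existing": 0.050, "new": 0.020},
    "version_2": {"existing": 0.045, "new": 0.040},
}

# Hypothetical traffic mixes: during the test, and after an acquisition push.
mix_during_test = {"existing": 0.9, "new": 0.1}
mix_after_acquisition = {"existing": 0.5, "new": 0.5}


def winner(mix):
    """Version with the highest blended conversion rate under this mix."""
    return max(rates, key=lambda version: blended_rate(rates[version], mix))


print("Winner during the test:   ", winner(mix_during_test))
print("Winner after acquisition: ", winner(mix_after_acquisition))
```

Under the test-period mix version 1 wins, but once the new demographic makes up half of the traffic, version 2 comes out ahead, so the earlier conclusion no longer holds.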
In sum, there are three core challenges with traditional A|B testing that we built Tozan to overcome: 1) Measurable Waste 2) Arbitrary Versions 3) Moving Target.
We’ll spend the next several posts diving into more detail on each of these common issues and understanding Tozan’s solutions.