How To Get Started with A/B Testing

barronernst · Published in Product Coalition · Dec 2, 2020

Keys to A/B Testing

I recently hosted a webinar on the keys to A/B testing and how to structure these components for a company. I decided to post the outline notes I had as this blog post, since I thought they made a good rough guide.

What follows is a rough outline of how I approach A/B testing when I join a company or get a testing program started. If your company is new to A/B testing, this can serve as a checklist to get you started.

If you’ve been doing A/B testing for a while, then this can just be a helpful refresher for some key questions to ask yourself about your testing philosophy.

To be clear, it’s not in the format of a normal blog post. It was designed as an outline, so please excuse the formatting.

Should you be doing A/B tests as a company?

How much traffic does your product have?

  • Can you run more than 1–2 tests a month and get a statistically significant result?
  • If not, you should question whether A/B testing is right for your company at this point. You may not get much return on it, and focusing on customer research and validation before building may be more valuable.

Do you have product-market fit?

  • Can you clearly identify the problem you are solving?
  • Do customers come back day after day (or week after week) to solve this problem with your product? What’s your retention rate, and does it flatten out or go to zero?

Are the concepts you want to test things that you can isolate in an A/B test? Or will you not get any learnings because you aren’t able to isolate properly?

Can you answer the following questions well:

Who is your customer?

What job or task or experience are they leveraging your product for? Do you fundamentally understand the core value that you are providing?

How will this A/B test or multi-variate test impact that or improve that experience or core metric? Or, if it’s focused on a sub-metric, how do you know that the sub-metric ladders to the core metric for your business?

What’s your core product loop? What’s the primary way people get onboarded, engaged, and re-engaged with your product?

  • E-commerce
  • Marketplace
  • Networking
  • SEO or viral content
  • Other?
  • If you haven’t cracked this, A/B testing is not going to solve your problem. It’s critical to figure out your core loop for onboarding first.

What does your retention curve look like by cohort? Does it flatten out or does it go to zero? If your retention curve does not flatten out, you need to go back to the critical task of talking to customers and achieving product/market fit before you advance to A/B testing. If your curve keeps trending toward zero, you are not ready to A/B test. You need to achieve product/market fit first!

Most A/B testing happens once you have validated a feature or a critical product flow, during the flattening parts of the growth curve. The typical pattern is that you find a new innovation or feature and then optimize it to the point of marginal returns until the next breakthrough. Growth is not linear.

Traffic

A key determinant in your ability to run A/B tests successfully will be traffic. In general, do you have enough traffic where you can conceivably get a good test result in a relatively short time period? It’s critical to think through in advance how long a test will need to run before you even start the process.

There are lots of simple tools out there, but it’s important to enter a few different scenarios to get a sense of how long it will take to reach a statistically significant result:

https://neilpatel.com/ab-testing-calculator/

https://www.surveymonkey.com/mp/ab-testing-significance-calculator/

I highly recommend playing with tools like this before you even start to run tests. Enter the traffic the test will get in a week and try different conversion changes to understand how big an effect, or how long a test, you’ll need in order for the test to be successful.
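If you want to sanity-check what calculators like these are doing, here is a minimal sketch of the standard two-proportion sample-size estimate. It’s my own illustration rather than the exact formula either tool uses, and the 3% baseline and 10% relative lift are made-up numbers:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)        # rate you hope to see in the variant
    p_bar = (p1 + p2) / 2                           # pooled rate under the null hypothesis
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)            # desired statistical power
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Example: 3% baseline conversion, trying to detect a 10% relative lift.
print(sample_size_per_variant(0.03, 0.10))  # roughly 53,000 visitors in EACH variant
```

Divide the result by the weekly traffic hitting the page under test to estimate how many weeks the test would need to run; if the answer is several months, the test probably isn’t worth running at your current traffic.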

Beyond this, you also need to consider your ability to run more than one test at a time. Be careful about running too many overlapping tests if you have low traffic; limited traffic will necessarily cap your velocity and ability to run tests if you can’t get each test to validation quickly.

You also need to consider which part of the product/website/app you are testing. A top-of-funnel test can usually reach validation much more rapidly than a test on a checkout flow or somewhere deeper. The upfront cost of a test will therefore vary across different elements of your product, depending on the traffic they receive and the time investment it will take for the test to give you a result.

If you don’t have enough traffic to run a test without getting to a statistically significant result within a reasonable timeframe, it’s more important to validate the concept with user research in advance and to have higher conviction about the feature before making the investment.

More traffic means less overall thought and investment is needed to plan each experiment, because the cost of running an experiment is lower.

Remember that usually only 10–20% of tests are successful. If you don’t have the traffic and team velocity to run many tests in a short period, you may not feel like you are making an impact.

Goals or Targets for the Test

It’s critical that, before you start running a test, you are clear on which metric you are planning to move. For example:

  • If you optimize for a registration flow, you are likely most closely going to be tracking your ability to get users to register with whatever your core registration method is (phone, email, etc)
  • If you are running a checkout test, then you are focusing on purchase
  • Virality is all about how much a user is sharing and whether other users are accepting those invites or clicking on links

It’s incredibly important to be clear on the metric you expect to directly impact when you launch the test. However, it’s also critical to think about the impact you want to have on the bottom-line metrics for the business. For example, increasing registration for an e-commerce website will seem like a positive, but if you don’t track whether those customers are profitable or revenue-driving, a positive result may not have the overall net impact you are expecting. Take the time to understand how increases in your top-of-funnel metric affect the bottom line of your business.

Think through all the top- and bottom-line implications of a test, and how you will track those elements, before you launch and while you are structuring the metrics for your A/B test.

It’s important not to look only at metrics like signup conversion, but also at elements like cohort engagement, purchase rate, referral rate, and other indicators that show you are doing a good job of bringing in good customers.
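As a toy illustration of that point (all of the numbers below are invented), a quick per-variant comparison of a top-of-funnel metric against a bottom-line metric can show whether a “win” actually matters:

```python
# Hypothetical results: variant B lifts registration but not revenue per visitor.
variants = {
    "A": {"visitors": 50_000, "registrations": 2_500, "revenue": 75_000.0},
    "B": {"visitors": 50_000, "registrations": 3_000, "revenue": 75_500.0},
}

for name, v in variants.items():
    reg_rate = v["registrations"] / v["visitors"]    # top-of-funnel metric
    rev_per_visitor = v["revenue"] / v["visitors"]   # bottom-line metric
    print(f"{name}: reg rate {reg_rate:.1%}, revenue/visitor ${rev_per_visitor:.2f}")

# B wins on registration rate (6.0% vs 5.0%) but barely moves revenue per
# visitor, so the apparent win may not matter for the business.
```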

Channel or Platform for Testing

It’s incredibly important to realize there are many fundamental differences between different platforms or channels for A/B testing.

Here are some of the most common areas where companies regularly do A/B testing:

  • Email
  • Push Notifications
  • Registration
  • Checkout
  • Viral Components
  • Core Funnel Steps

Some key tips when it comes to testing each of these channels:

Email:

  • Usually high volume
  • Variables you can control: subject lines, preview line, content, CTAs, sender address.
  • It’s incredibly important to be careful and well structured when it comes to email testing. Running a lot of A/B tests in email is critical to increasing open and click-through rates. However, if your email starts to feel too spammy or you send too many emails, you risk hitting spam traps or having too many users report you as spam, which can lead to suspensions or blocks from ISPs. The key is to be rigorous about list maintenance and to protect the quality of your sending reputation.
  • It’s key to spend a lot of time testing subject lines, as they are the critical gateway to an email getting opened. Work on dynamic subject lines, but also test things like emojis, interesting phrasing, or other elements to see what best drives opens for your user base.

The content of the email needs to be obvious and well structured

  • I spent a lot of time testing a variety of templates at a major flash-sale site when I ran email there. It was often low-impact and didn’t usually lead to major changes in CTR.
  • Instead, it was more important to focus on bringing the right content to the top of the email through personalization or other segmentation
  • It may also be that a template is not important at all. There’s been more success lately with limited or low-template emails, because they sometimes land in the main inbox instead of the Promotions tab in Gmail or other services.

It’s worth testing a lot of elements in email because if you have many signups, this is usually a high volume channel

Need to be careful that you aren’t seeing a lot of unsubscribes
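For subject-line tests specifically, a simple two-proportion z-test is enough to check whether one open rate is meaningfully better than another. This is a minimal sketch with made-up send and open counts, not the exact math any particular email tool uses:

```python
from math import sqrt
from statistics import NormalDist

def open_rate_z_test(opens_a, sends_a, opens_b, sends_b):
    """Two-proportion z-test comparing the open rates of two subject lines."""
    p_a, p_b = opens_a / sends_a, opens_b / sends_b
    p_pool = (opens_a + opens_b) / (sends_a + sends_b)        # pooled open rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / sends_a + 1 / sends_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))              # two-sided p-value
    return p_b - p_a, p_value

# Example: subject line B gets 2,150 opens from 10,000 sends vs A's 2,000.
lift, p = open_rate_z_test(2_000, 10_000, 2_150, 10_000)
print(f"lift={lift:.2%}, p={p:.3f}")
```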

Push notifications

  • There are lots of similarities to email. The first key is testing your flow to get users to opt in to push notifications. Try double opt-in and opt-in prompts at multiple points in the user experience to determine the optimal time to ask a user to agree to receive push notifications from your product.
  • The key in tracking push notification testing is realizing that, unlike email, you don’t have multiple channels that separate marketing from other key notifications. Any opt-out will impact your ability to send push notifications about all topics.
  • The push should lead to an action — users are increasingly opting out when they aren’t receiving value from the notification
  • It’s critical to test the copy and the content of the notification. The best-performing notifications also deep link into the right part of your app. It’s worth considering a vendor like Branch or others here to make sure you are providing the right level of engagement and depth to your users.

Registration and Onboarding

This post is a few years old but still fairly relevant: https://barronernst.com/4-ways-to-optimize-your-reg-wall/

The key to registration optimization is to determine how much you want to let users see of your product before prompting them to provide an email address or other information. Some products naturally will work better with most of the content behind a registration wall whereas others will require some initial browsing or usage experience before prompting for an email address. It’s critical to figure out the right balance for your product and this can be an area of optimization and testing as you get more users coming to your door.

Registration testing usually has a few critical variables:

  • Channel (mobile, web, app)
  • Source (direct, SEO, UA/paid marketing, viral referral)
  • Intent

Usually this is a part of your product where you have high traffic and less clear initial intent, so there may be a great opportunity for more testing. Therefore, it can be ok to take more risks testing at this level.

There are a number of variables to consider testing when you are looking at registration or top of the funnel testing. Critical things to consider:

  • How much of your product or experience can you expose before asking for information? It’s critical to test this over time to see the difference when you expose different portions of the product before asking for registration or other information.
  • Some products just won’t make sense without collecting information, so you may not have many options for how early you collect data from the customer. In that case, the key becomes regularly testing onboarding and other elements to understand the right moments to ask for things like:
  • Push notification acceptance
  • Location sharing and the level of access you request
  • Email address or phone number
  • When to introduce trials or other elements to the user experience
  • Testing should not just be focused on collecting information, but also on understanding how to introduce the customer to the product in a way that leads to them becoming a long-time user. If you’re just collecting data and then providing no further critical onboarding or information to a customer, you aren’t going to have a successful onboarding experience that leads to valuable cohorts of customers.
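Whatever you decide to test in registration and onboarding, each visitor needs to be assigned to a variant consistently across sessions. Most testing tools handle this for you, but here is a minimal sketch of deterministic hash-based bucketing if you end up rolling your own; the experiment name and variant labels are hypothetical:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("control", "treatment")) -> str:
    """Deterministically bucket a user so they always see the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)   # stable bucket derived from the hash
    return variants[bucket]

# Example: the same user always lands in the same arm of a (hypothetical)
# "reg_wall_timing" experiment, as long as the same ID is used.
print(assign_variant("user_12345", "reg_wall_timing"))
```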

Viral Components

  • When it comes to viral components, there are multiple critical elements to test and track
  • Sharing of your viral content (how often shared, to how many people is it shared)
  • Clicks and engagement with viral or shared content
  • New users or re-engaged users driven by sharing of viral or other content
  • Decent article on K Factor: https://medium.com/@adjblog/basic-overview-of-k-factor-in-viral-growth-models-for-your-startup-2ee641b04bfb
  • The key to understanding virality or shareable content is to break down what happens at each step of the viral funnel to understand where you have the biggest opportunity for testing and optimization
  • You need to determine whether the biggest opportunity is in sharing, consuming, or taking an action after consuming the content
  • If you don’t understand the components that lead to virality, you haven’t gotten to product market fit and it’s critical to go back to that step before you start A/B testing
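To make the viral funnel concrete, here is the basic K-factor calculation sketched with made-up numbers (see the article linked above for a fuller treatment):

```python
def k_factor(invites_per_user: float, invite_conversion_rate: float) -> float:
    """K = average invites sent per user * conversion rate of those invites."""
    return invites_per_user * invite_conversion_rate

# Example: each user shares with 4 people and 10% of recipients sign up.
k = k_factor(invites_per_user=4.0, invite_conversion_rate=0.10)
print(k)  # 0.4 -- each step of the viral funnel is a separate lever to test
```

A K-factor above 1 means each cohort of users brings in a larger one; below 1, sharing amplifies your other channels but won’t drive growth on its own.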

Core Funnel Steps

It’s critical to define what your core funnel looks like when you are planning A/B testing and optimization. When you are thinking about optimizing and improving the core funnel or user experience of your product, you need to get good tracking in place throughout your funnel or core user experience to understand where you should be optimizing and improving the product as you go.

As an example, I defined this explicitly when I ran growth at One Kings Lane.

The goal of defining this for your company is to enable your growth team or your product teams to understand where they should optimize your product and improve it.

This should become the guideline for where to run critical A/B tests for your product and where you can make improvements. You shouldn’t be running a bunch of A/B tests at the checkout step of your product if you are seeing you have a problem with registration. Strategy and understanding of your product is key.
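As a simple illustration of what that tracking can tell you (the step names and counts below are hypothetical), computing step-to-step conversion shows where the funnel leaks and therefore where testing effort is likely to pay off:

```python
# Hypothetical weekly event counts for a core e-commerce funnel.
funnel = [
    ("visit", 100_000),
    ("view_product", 60_000),
    ("register", 12_000),
    ("add_to_cart", 6_000),
    ("checkout", 2_400),
]

for (step, count), (next_step, next_count) in zip(funnel, funnel[1:]):
    rate = next_count / count                      # step-to-step conversion
    print(f"{step} -> {next_step}: {rate:.1%}")
# The step with the worst conversion relative to your benchmarks is usually
# where A/B testing effort should be focused first.
```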

Users and Hypotheses for Tests

Now that you’ve spent a lot of time analyzing the variety of topics above, you should have a clear framework for your business. The last and most critical element of A/B testing is understanding your customers and how to generate hypotheses for the tests you want to run.

As mentioned earlier, the depth of thinking behind a test’s hypothesis largely depends on a number of factors: traffic, stage of the company, level of product/market fit, etc.

For example, if you have lots of traffic and a fairly well defined flow with clear data and you know you can get results quickly, the importance of a very well defined hypothesis for your tests is lower because the cost of running a test is lower.

The general rule is that the higher the cost of running a test, either due to development effort, time to validation, or any other element, the more time you should spend questioning the test and the hypothesis behind it. As tests get cheaper, a great hypothesis isn’t as critical, but it’s still a best practice to at least ask the question of your team and of yourself regarding why you are running certain tests.

A well formed hypothesis should outline why you are running the test, what change in user behavior you hope it will create, and also what metric you are planning to monitor and measure as the primary goal for the test.

I highly recommend creating a template for A/B testing at your company if you are new. Even if you are a solo PM or a leader of the product function, it can be highly valuable to force rigor into the A/B testing process.

It could look something like:

  • Area to be tested of the product
  • Platforms to be tested
  • Types of users included in test
  • Hypothesis for test
  • Description/Requirements
  • Metrics monitored/goal metrics for the test
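If your team prefers to keep these briefs close to the code, the same template can be captured as a small data structure. This is just one possible shape, with field names of my own choosing rather than any standard:

```python
from dataclasses import dataclass, field

@dataclass
class ABTestSpec:
    """Lightweight experiment brief mirroring the template above."""
    area: str                      # area of the product to be tested
    platforms: list[str]           # e.g. ["web", "ios"]
    audience: str                  # types of users included in the test
    hypothesis: str                # why you're running it and the expected behavior change
    description: str               # description / requirements
    goal_metric: str               # primary metric the test is meant to move
    guardrail_metrics: list[str] = field(default_factory=list)

spec = ABTestSpec(
    area="registration wall",
    platforms=["web"],
    audience="new visitors from SEO",
    hypothesis="Showing three products before the reg wall will lift signup rate",
    description="Delay the reg wall until the third product page view",
    goal_metric="signup_conversion",
    guardrail_metrics=["week_1_retention", "revenue_per_visitor"],
)
print(spec.goal_metric)
```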

Testing Stack

You need to ensure you have the right setup for your testing tools in order to make this process as easy as possible. I have used a variety of tools, so I’m happy to discuss what has worked historically, but here are a few that are critical:

Mobile testing

  • Leanplum
  • Apptimize

General A/B Testing

  • Optimizely

Analytics

  • Amplitude
  • There are other options I’m not as big a fan of, but we can discuss them

Email

  • Braze
  • Salesforce Exacttarget
  • Sendgrid
  • Mailchimp

Push Notification

  • Leanplum
  • Braze

Customer Data Platforms

  • Segment
  • mParticle
  • Blueshift

Deep Linking

  • Branch
  • Appsflyer
  • Many others
