Guide (Introduction, Implementation and How it Works) to A/B Testing and Split URL Testing (VWO, Optimizely, Google Analytics)

Recently I've been investing a lot of time into A/B (split) testing/experiments and analysing their results. Slowly I've started to love this way of rolling out changes on the web (it can surely be used for mobile apps as well), as it is a completely data-driven approach and can produce quite surprising (or shocking, depending on how you see it) results at times. With this approach you'll always know what works (instead of speculating), with firm evidence, and eventually drive more sales or signups or whatever it is that matters to you. That said, I would recommend not A/B testing just anything and everything, but always finding the right situations or cases where it makes sense to experiment. If you have enough bandwidth (extra time), use this approach to keep analysing your visitor behaviour and draw learnings that help more prospects achieve the objectives you've set for them, eventually turning them into customers.

I'll start off with basic (yet comprehensive) introductions to the different types of website testing (these techniques can surely be exercised in a mobile app too) and then cover how you can start testing right today using various easy-to-use third-party software/tools. Finally, I'll also explain how it all works, which is going to be more of a technical piece (for the technically-interested folks).


Heads up: The article might seem lengthy but I assure you it's not going to take much time to complete. The content flow should be nice and easy and the overall complexity level is fairly low. So without further ado, let's begin!

The Tools

Before we actually kick off with all sorts of information, let me tell you that I've used these three tools for A/B testing: VWO, Optimizely and Google Analytics Content Experiments.

We started off with Optimizely for our A/B testing needs but eventually moved on to VWO as we found it more intuitive and better designed in terms of UI/UX. It is easier to use (IMHO) and has some extra nifty features here and there. I've also used Google Analytics Content Experiments in the past, though it is very limited feature-wise. It can surely get your job done if you're a technical person who can modify HTML/CSS/JS (and server-side) code several times to adjust the test. The other tools will let you create multiple goals, assign them to your experiment and then help visualize the data/reports in a much more intuitive way. Along with heatmaps, user segmentation and lots of other engagement data, VWO or Optimizely will prove to be better than GA Content Experiments.

My objective is not to evoke a sense of bias in you with all this information but rather to give you a fair idea about the various website/app testing tools out there in the market. Also, having used all these tools, you could say I have a fair idea of how A/B testing is done and how the final data is analysed in practice.

A/B (Split) Testing


Let me explain A/B or Split testing first. A/B testing basically means that you have two different versions of your website (V1 and V2), where you show V1 to 50% of your traffic and V2 to the other 50%. Whichever performs better in terms of conversion is your winning version (or variation) that you finally start showing to 100% of your traffic. It’s that simple!

Do note that it is not necessary for the two variations to be radically different, like two completely different web designs altogether. Instead of two entirely different websites (multiple pages), the two variations can just be a single web page with a small/medium difference in the content. For instance, imagine an e-commerce product page where in V1 (the original version) the product image shows on the left and the product title/description is on the right, whereas in V2 (the new variation) the placement is the opposite (content on the left and image on the right). Instead of changing the entire layout, you might consider making minor changes like changing the heading and an image. That is also perfectly valid.

So basically you can test either macro changes like an entire website re-design or even a micro change like a change in the title of a particular web page.

Of course, factors like the number of variations and traffic split are variable:

  • You can have multiple versions/variations – V1, V2, V3 … Vn
  • You can split traffic in any percentage – 50/50, 40/60, 80/20. With 3 variations the split could be – 33/33/33, 25/25/50, 55/25/20 and so on.

If you're using a nice software (like one of those tools I listed above) you can even play with the percentage of your traffic you want to include in your experiment. You could include only 20% of your traffic in the experiment or even 40% (or whatever number). So if you have an experiment with 3 variations (V1, V2, V3) where the traffic split is 25/25/50 and the traffic under consideration for the test is 40%, then you effectively end up sending 25% of that 40% of your entire traffic to V1 and V2 each, whereas 50% of that same chunk goes to V3. Good one!
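If the percentages feel confusing, here's a quick back-of-the-envelope check of that example in JavaScript (the numbers are the ones from the paragraph above):

// 40% of all traffic enters the experiment; that chunk is split 25/25/50 between V1, V2 and V3.
var percentInExperiment = 40;              // % of total traffic included in the test
var split = { V1: 25, V2: 25, V3: 50 };    // % split within the experiment

Object.keys(split).forEach(function (variation) {
  var effective = (percentInExperiment * split[variation]) / 100;
  console.log(variation + ' effectively sees ' + effective + '% of the entire traffic');
});
// V1 effectively sees 10% of the entire traffic
// V2 effectively sees 10% of the entire traffic
// V3 effectively sees 20% of the entire traffic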

Controls and Variations


It is very important to get these two terms right with regards to A/B/n (more than one variation) testing. When experimenting you'll have your default version, which is called the control, and one or more other versions which you want to test against the control. These one-or-more-other-versions are technically called variations. So you have your control version of the website where you make small/medium/large changes to one or more pages (from tiny headline changes to complete web re-designs) to create one or more variations.

So A and B are different versions of a property (website/webpage or app/screens) where you can call A the control and B a variation. A/B testing is also used to refer to experiments where you have multiple variations (control, V1, V2, V3), but some refer to such setups as A/B/n testing, which is technically more appropriate.

Multivariate Testing


There's another type of split testing called Multivariate testing. Understanding this piece is fairly simple. Imagine the pricing page of any SaaS business. The page will have a heading, probably a sub-heading and then a couple of pricing plans. There will obviously be more information below the pricing section but we'll restrict our focus to this part of the fold only.

You could definitely test different headings on the pricing page like (imaginary examples):

  • Start your 14-day Free Trial
  • Signup for [Business/Product Name]
  • Amazing Plans Just For You

Send a 33/33/33 split of your traffic to each version/variation and you're A/B testing already. But imagine a case where you had 3 different types of headings, 3 sub-headings and 3 different designs (maybe placement-wise or UI-wise) for the pricing plans. Now you want to test every possible combination of all these elements to see which will perform better. In that case, you end up having 27 (3 x 3 x 3) variations or versions to test, where each receives a 1/27 split of your traffic. This type of testing, where you have different versions of different elements and want to see which combination performs best in terms of conversions, is called Multivariate testing.

Bonus: See how Highrise had tried different headings (ages ago) to increase signups on their pricing page.

So as a rule of thumb, whenever you have different versions of different elements that form multiple combinations that you want to test among your users, Multivariate is the way to go! Don't confuse this case with one where you have 2 different versions of your heading (H1, H2) and 2 different versions of an image (I1, I2) but you just want to test the combinations (H1, I1) and (H2, I2) – this is a normal A/B testing candidate, not a Multivariate one.

Split URL Testing


What the heck is this now? Split URL testing? It's just like A/B testing, except that the versions (variations) exist on different URLs, either on the same domain or on different domains. In A/B testing, if you have two different versions of a particular component on the home page, you can drive 50% of the traffic to each version right on the same URL. But in this case, you have the different versions hosted on different URLs and drive the traffic split across those URLs. Here are some URL examples for two variations:

  • http://example.com and http://example.com/variation1
  • http://example.com/variation1 and http://example.com/variation2
  • http://example.com and http://anotherexample.com
  • http://example.com/variation2 and http://anotherexample.com/variation2

You'd generally want to do this type of testing when you're testing two majorly different versions of the same page. Basically, imagine an entirely new design of your home page. There have been times when, for major re-designs like a completely new website design or a new mobile responsive version of the current website, I used normal A/B testing to set custom cookies and then reload the page based on the variation cookie that was set. Not only was this method weird but it also skewed the results at times when the refresh would happen too late or just go wrong. It'd also impact other tracking tools, which just didn't make any sense. Split URL is a great candidate for this kind of testing and I wish I had gone this route back then.

Bonus: There's a Highrise case study on how they tried totally radically different variations to boost their conversions. Here are Part 1, Part 2 and Part 3. You'll learn loads from them.

What are Conversions and Goals?


I've been mentioning this keyword all around but what exactly is it? Conversion is basically what you care about. It is the success rate of an expected Goal that you set in your A/B testing software or process. Let's understand with a couple of examples.

Let's say you re-designed your homepage altogether. The first version had a "Signup" button for your product that led to a signup page or opened up a signup modal, whereas the second version has a signup form right there above the fold (in the initial viewport), right when the page loads. Seems easier to sign up in the second version, right?

In this case you clearly "care" about signups. So that is going to be one of your goals. Goals are the achievements or objectives that you want your visitors/users to accomplish on your website. Different tools let you set up this goal for an experiment in different ways. You can either:

  • Set up a goal to track page visits on the success page that you see right after the signup.
  • Set up a goal to track clicks on the signup submit button, but this is not entirely fool-proof as there can be validation errors and those clicks will still increment the total count in your reports/results, hence inflating the conversions.
  • Set up a goal to track form submits to a particular URL (your signup URL in this case).
  • Set up a goal to track something else, like triggering an event.

Effectively what happens is that, in the results, you will see the total number of people landing on the home page for each variation, and if you have set up a Goal that tracks page visits on the success page then you will also see the total number of people who reached there for each variation, which gives you the conversion number and hence the conversion rate. Here's an example:

  • V1 (50/1050) – 50 people out of 1050 who landed on the homepage in V1 converted (4.76%).
  • V2 (200/1100) – 200 people out of 1100 who landed on V2’s homepage converted (18.18%).

Clearly the second variation has a way higher conversion number/rate for your Goal and hence you will want to update your design with that same version.
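If you like seeing the arithmetic spelled out, the conversion rate is just conversions divided by visitors:

// Conversion rate = conversions / visitors, using the numbers from the example above.
function conversionRate(conversions, visitors) {
  return ((conversions / visitors) * 100).toFixed(2) + '%';
}

console.log('V1:', conversionRate(50, 1050));   // V1: 4.76%
console.log('V2:', conversionRate(200, 1100));  // V2: 18.18%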

Split Testing Scenarios

Let’s explore a couple of different scenarios where you will probably want to A/B test:

  • You started your company six months back and launched a brand new website for it. Now you find that design quite antiquated and not so intuitive because your business model has changed a lot in the last six months. You come up with a new design of the website but it might be a bad idea to just go ahead and launch it. Instead you should use a third-party software (one of the above) or an internal implementation to A/B test, splitting your traffic between the old and the new designs to see which performs better in terms of conversions.
  • You've decided to try a different design for a particular component. The component could be as simple as a button or as complex as an entire search widget, a header or that signup form on your home page right above the fold. I'd say definitely A/B test that. Of course, don't start A/B testing very small things like useless buttons or tooltips here and there, but do test what could improve your entire funnel and what matters for your business. A button is not worth A/B testing unless it leads to a signup or purchase conversion, which is what you care about.
  • You want to build a mobile optimized/responsive version of your desktop site.

Be really, really smart with what you A/B test. Don't waste a lot of time making silly variations that eat up your time and energy in first thinking about them, then making the changes and then analysing the results. I've seen a lot of people make that mistake and it's not worth it, trust me! It could also possibly piss off other people with whom you work in tandem.

Bonus: Here are some case studies for fun with surprising uplifts.

Protip: Although these case studies might evoke that urge to start A/B testing right away, don't be random with it. What worked for those brands may not work for you. Don't start making small changes like changing the color of a CTA or modifying the headline expecting major uplifts like those in the case studies – you'll actually end up wasting weeks or months of time. Feel free to take some of those numbers with a pinch of salt and use the content as inspiration to start researching and analysing how you could bring about a radical change on your website, leading to a radical uplift in conversions and hence revenue/engagement.

A/B, Multivariate, Split URL – When to Use Which?

It'll be good to have some rules of thumb, or at least basic ideas, around which type of testing to use in which scenario. You probably know the answer by now but I thought a devoted section would reinforce the mindset.

Use the normal A/B test for almost any sort of new feature launch or changes on any webpage. Following are some cases that are good candidates for plain simple A/B tests:

  • Redesign of an entire component like a header, footer or search widget.
  • Modifying few elements on a particular page or even multiple pages like heading, buttons, tables, forms.
  • A new UI change like some form of competition comparison indicator, addition/removal of videos or images, layout changes, a new UI-intensive feature launch, etc.

Note: Do use your intelligence to experiment only when one is required. Only go the A/B testing route if the change will impact a conversion funnel that matters. If it is about some FAQ design change where you or your designer are pretty sure that the UI is getting a decent revamp, don't go the experimentation route as it might not be fruitful at all.

If you ever want to play around with the combinations of multiple versions of multiple elements, go ahead with Multivariate testing. Although keep in mind that a multivariate experiment will simply require a lot of traffic to produce a reasonable winning result/combination. Why so? Because the number of combinations will be very high, and it increases massively with the addition of a single variation or a single element. This is the formula that you'll always have to remember:

Total # of Variations = [# of E1 Variations] x [# of E2 Variations] x … x [# of En Variations]

… where the total # of variations is actually all the combinations up for testing, and E1, E2 … En are the elements, each with multiple variations of their own.
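Here's a tiny sketch that enumerates the combinations from the pricing-page example above (3 headings x 3 sub-headings x 3 plan designs), just to show where the 27 comes from:

// Every combination of element variations becomes one multivariate variation to test.
var headings    = ['H1', 'H2', 'H3'];
var subHeadings = ['S1', 'S2', 'S3'];
var planDesigns = ['P1', 'P2', 'P3'];

var combinations = [];
headings.forEach(function (h) {
  subHeadings.forEach(function (s) {
    planDesigns.forEach(function (p) {
      combinations.push([h, s, p]);
    });
  });
});

console.log(combinations.length); // 27 – each combination gets roughly 1/27th of the traffic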

Finally, Split URL testing should probably only be used when testing a major re-design, like a completely different homepage (or product, checkout, payment page) redesign or a full website redesign. So out of 100% of the traffic coming to yoursite.com, 50% will see the V1 design of the homepage (and/or further pages) whereas 50% of the users will be redirected to yoursite.com/variation2 to see the V2 design of the homepage (and/or further steps/pages). If you're wondering how this is technically implemented, we'll be covering that in a bit.

Another case for Split URL is a brand new mobile version/design of your current site, which is sort of similar to the previous case.

For me, 9 out of 10 tests have been A/B, 1 Split URL and probably none Multivariate. YMMV!

Split Testing in Action

We've spoken a hell of a lot about the concepts but how is it all put into action? Well, for that you can (and probably should) use third-party tools like VWO, Optimizely or Google Analytics Content Experiments as I already mentioned earlier. I'll talk about the implementation keeping VWO in mind (and using it for examples) but it is not much different from Optimizely (except the UI/UX) and conceptually quite similar to Content Experiments. Surely you can build something in-house too or have a plain simple internal A/B testing framework coded, but the implementation process won't be much different from using something like VWO, hence whatever content follows will still be relevant.

I'll list the integral steps that are generally required to set up and run an experiment on VWO as well as GA Content Experiments.

1) Create a Campaign

In VWO, go to the create section and you'll find multiple options (A/B test, Split URL, Multivariate, etc.). We'll just go ahead with the A/B Test type as that'll be enough, and you should easily be able to figure out the others once you know this one.

Creating an A/B Test type will ask you for your website URL, so just punch in that data. Not only will this URL be used to preview your website in their live visual editor (which is pretty kickass), it will also be the URL where the experiment will run for users once they land on this page, either directly or through some internal/external referrer. This is basically the URL where VWO will drop cookies to attach a particular visitor to your experiment (with one of the variations). So for your visitors, the experiment begins from here.

Surely once the campaign is created, you can go to “Settings > URLs” to add multiple URLs on which the experiment should start off for the visitors. Let’s take two cases for some more clarity:

  • If you want the experiment to start on all the pages of your website, wherever the user lands, then punch in this pattern – http://yoursite.com*.
  • If you have city-specific home pages for users, and you want the experiment to only start on the homepage, then use a regexp pattern like this – http(.*?)://yoursite.com/(city1|city2|city3).

This is how the “Settings > URLs” section looks on VWO:

[Screenshot: VWO campaign "Settings > URLs" section]
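And if you're unsure what that second (regexp) pattern from the list above will actually match, you can sanity check it in the browser console. This is just an illustration of the pattern itself, not how VWO matches URLs internally:

var pattern = new RegExp('http(.*?)://yoursite.com/(city1|city2|city3)');

console.log(pattern.test('https://yoursite.com/city1'));        // true
console.log(pattern.test('http://yoursite.com/city3?ref=ad'));  // true
console.log(pattern.test('http://yoursite.com/pricing'));       // false – the experiment won't start here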

2) Create Variations

During the creation process, once you enter your URLs, you’ll be able to create variations and make appropriate changes for those variations right there in their live visual editor. You can move components/elements around, make textual (as well as image/video based) changes, remove/add elements and if you know coding then you can also add/edit HTML, CSS and JavaScript code.

[Screenshot: VWO live visual editor]

But what if the changes are too complicated to make via the live visual editor? Sure, some technical person will need to be involved, but what if it is too messy to make all the complicated changes via the editor itself? Let me tell you what I do in such cases.

In the editor you see a list of variations which, on clicking, will show you a small option called "Edit code". Clicking on this will open up a code editing section, where I usually set a custom cookie using JavaScript. So when the user lands on the specified URLs where the experiment starts for them, the JS code is executed and sets the custom cookie, which can then be accessed both in the frontend and the backend (in subsequent requests). Based on this new cookie, I make corresponding changes in the UI using JS, or even spit out different HTML directly from the backend when/if required. I can now also track data differently from the frontend/backend if required. Of course VWO will also set its own cookies to track the experiment as well as the variation the visitor is a part of, but there's no point in using those: you won't know their names, and even though VWO exposes a standard object that stores such information (try _vwo_exp in the console) that you could use, I think you'll just be better off with a simple custom cookie that you set yourself and work with that.

// in Control's code editor
vwo_$(function () {
  document.cookie = 'custom_control_cookie=1; path=/; domain=codetheory.in'; // can set the expiry too if required!
});

// in Variation1's code editor
vwo_$(function () {
  document.cookie = 'custom_variation1_cookie=1; path=/; domain=codetheory.in'; // can set the expiry too if required!
});
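And here's roughly how the frontend (or any script that runs later on the page) can read that custom cookie and branch on it. This is just a sketch; the cookie names are the hypothetical ones set above and getCookie is a tiny helper, nothing VWO-specific:

// Read a cookie by name (plain JS, no library assumed).
function getCookie(name) {
  var match = document.cookie.match(new RegExp('(?:^|; )' + name + '=([^;]*)'));
  return match ? decodeURIComponent(match[1]) : null;
}

// Branch the UI based on the custom cookies set in the code editors above.
if (getCookie('custom_variation1_cookie') === '1') {
  // render/enable the variation 1 experience
} else if (getCookie('custom_control_cookie') === '1') {
  // render/enable the control experience
} else {
  // visitor isn't part of the experiment (or the snippet hasn't run yet)
}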

3) Goals and Traffic Distribution

Once you're done with creating your variations and assigning changes to them via code or the editor itself, you'll be prompted to create your Goals in the creation process. We've already discussed goals before: they are the accomplishments you want to track for your users because they matter to your business. You have certain objectives for your visitors that you want them to achieve. This is how you'd create Goals on VWO (from the creation process or "Settings > Goals" after creation):

[Screenshot: VWO "create goals" section]

A very simple, yet powerful way of tracking different types of conversions and later, based on that data in the reports, announcing the winning variation.

Also don’t forget to specify your traffic distribution. You’ll find it as a small option below (I think they could have made it a lot more prominent). Here’s a screenshot (from the last step of the creation process or “Settings > Others” after creation):

[Screenshot: VWO traffic split option]

4) Start Campaign

Once you’re done, go ahead and start your campaign. Hooray! You just learnt how to create A/B experiments which will run fantastically and you can always find your way back to the campaign settings and change them as you deem necessary.

5) Analyse Results/Data

As your campaign runs for some time and visitors accomplish the goals, you’ll be able to see the reports in the “Detailed Report” section and compare them either by Goals vs Variations or the other way round. Here’s a screenshot of one of my campaign reports:

[Screenshot: VWO detailed report]

(They also have a nice graph that I’ve deleted from the screenshot in order to accommodate the report.)

The reports are simple to understand, but for those who'd want to be sure about everything by exploring a little more (although not required), feel free to go through these two small VWO knowledge base links.

That’s it! That’s all there is to implementing A/B testing on your website.

Further Notes on GA and Optimizely

If you know how to setup an experiment in VWO, then you should pretty much be able to do the same in Optimizely. It’s not much different. Conceptually and module-wise it’s all the same.

Note: Although VWO only does website A/B (split) testing as of now, Optimizely is also into mobile app split testing, which is cool! So yeah, not only can you experiment with multiple variations on your website, you can also do the same in mobile apps.

In Google Analytics, you’ve got Content Experiments that can be used to do split testing. If you just go to your website profile/view in analytics and click on “Behaviour > Experiments”, you’ll find that section. This is where you’ll have to create and view reports for your experiments. Of course it lacks a lot of the features you’ll find in VWO or Optimizely.

The creation process is fairly simple, it has about 4 steps:

[Screenshot: GA new experiment]

The first step asks you to name the experiment and specify your primary Goal (or objective) for the experiment. You can set other important variables like what percentage of traffic you want to allocate to your experiment, how long it should run (2 weeks minimum, for instance), the statistical confidence level (95% at least) and whether to disable multi-armed bandit optimization or not (more on this in a bit).

[Screenshot: GA configure experiment]

If you look at the image above (the second step), it basically asks you to specify the original (control) and variation details, which will mostly be their names and URLs. If you're relating this to a Split URL test, then you're right! The easiest (or default) way to set up experiments on GA is actually the split URL way. You can surely set up non-split-URL testing (without redirection) by writing some JS code (yes! you should know JS) on your origin/control page, which we will discuss in a bit.

In the third step you'll be presented with a small JavaScript snippet that you'll have to copy-paste right after the opening <head> tag. This is different from your normal GA tracking snippet that you had pasted before the closing </body> tag. You'll also be able to see your Experiment ID and Experiment Key here.

The final (fourth) step will allow you to validate the presence of this snippet (from the previous step) on your control and variations and finally start the experiment.

By default the way experiments work in GA is that once the user lands on the original page, GA will either redirect the user to one of the variations or just refresh the original page. With the refresh or redirect, GA also appends a GET param to store the experiment key, ID and variation number information. So the URL will look something like this:

# Refresh (original)
http://codetheory.in/?utm_expid=65042302-0.g3WqZjLOQB6vC19Dpq_mwA.0
# Redirect (variation)
http://codetheory.in/variation1/?utm_expid=65042302-0.g3WqZjLOQB6vC19Dpq_mwA.0

You can surely avoid these params in your URLs, and also the split URL style of testing, by spending some time integrating the Experiments API into your JS code. It's very simple, go ahead and read the client-side experiments without redirection document. In essence, what you do is punch in placeholder data for your variations while setting up the experiment and then make a call to a method available in their JS-based API SDK (cxApi.chooseVariation() currently) to select the chosen variation for the visitor. Based on the variation index (0 if control, 1 if V1, 2 if V2), you make changes to your UI and the rest of the data tracking is automatically handled by GA.
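Here's a minimal sketch of that client-side approach. It assumes GA's cx/api.js has already been loaded in the <head> with your experiment ID, and the .pricing-heading selector is just a made-up example element:

// Assumed to be on the page before this script runs:
// <script src="//www.google-analytics.com/cx/api.js?experiment=YOUR_EXPERIMENT_ID"></script>

// Ask GA which variation this visitor should see (the choice is sticky across visits).
var chosenVariation = cxApi.chooseVariation();

// 0 = original/control, 1 = first variation, 2 = second variation, ...
if (chosenVariation === 1) {
  document.querySelector('.pricing-heading').textContent = 'Start your 14-day Free Trial';
} else if (chosenVariation === 2) {
  document.querySelector('.pricing-heading').textContent = 'Amazing Plans Just For You';
}
// Conversions are then tracked through your normal GA goals/pageviews.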

You can also have a server-side implementation. Let’s see what all will be involved in such an implementation:

  • When the user visits the URL where the experimentation should start, you have to decide whether the visitor should be included in the experiment or not, based on the URL as well as the percentage of traffic you are including in your experiment. It has been explained well with sample code here. You'll also have to decide which variation to show (again, sample code is there) and make sure to track all this data using cookies.
  • Once the decisions have been made, you’ll have to let GA know about the experiment ID and variation index so that it can store this data and also show reports.
  • Retrieve the experiment data for visitors periodically and store them in your own database.

The number of things you can/should do sounds cool but I highly doubt you’ll ever need to go this route. But it’s good to know!
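To make the steps above a bit more concrete, here's a rough Express-style sketch. Every name in it (the exp_variation cookie, the route, the markup) is hypothetical, and it assumes the usual analytics.js snippet is already on the rendered page; GA is informed through the expId/expVar fields:

// A rough sketch of the server-side steps: decide inclusion based on traffic coverage,
// pick a variation, remember it in a cookie, and let the page report expId/expVar to GA.
const express = require('express');
const app = express();

const EXPERIMENT_ID = 'YOUR_EXPERIMENT_ID'; // shown in step 3 of the creation flow
const TRAFFIC_COVERAGE = 0.5;               // include 50% of visitors in the experiment
const VARIATIONS = 2;                       // 0 = original, 1 = variation 1

function renderPage(variation) {
  // Only visitors who are actually part of the experiment report expId/expVar.
  const gaExperiment = variation >= 0
    ? `ga('set', 'expId', '${EXPERIMENT_ID}'); ga('set', 'expVar', '${variation}');`
    : '';
  return `<script>${gaExperiment} ga('send', 'pageview');</script>
<h1>${variation === 1 ? 'Start your 14-day Free Trial' : 'Signup for [Product Name]'}</h1>`;
}

app.get('/', (req, res) => {
  // Re-use an earlier assignment if the visitor already has one.
  const match = /exp_variation=(-?\d+)/.exec(req.headers.cookie || '');
  let variation = match ? Number(match[1]) : null;

  if (variation === null) {
    // Not assigned yet: first decide inclusion, then pick a variation at random.
    variation = Math.random() < TRAFFIC_COVERAGE
      ? Math.floor(Math.random() * VARIATIONS)
      : -1; // -1 = not part of the experiment
    res.setHeader('Set-Cookie', `exp_variation=${variation}; Max-Age=${90 * 86400}; Path=/`);
  }

  res.send(renderPage(variation));
});

app.listen(3000);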

In the “Behaviour > Experiments” section, you should be able to compare your variations by going through the reports that GA generates by tracking all the data. Here’s a sample report (fairly easy to understand, no need to read some sort of doc for it):

[Screenshot: GA experiment report]

Multi-Armed Bandit Experiments

Note: The idea is to just scratch the surface of this concept but not explain it from top to bottom as that’s an advanced subject and will become a different guide in itself.

By default, GA employs multi-armed bandit optimization. You can surely disable it while creating the experiment, as mentioned earlier. What this optimization basically does is check the different metrics (goals being an important one) of the experiment's variations twice a day and start sending more traffic to the variation that performs best. This way you lose fewer conversions, as the poorly performing variations start losing traffic. Pretty cool, right! This is currently not there in the other tools.

Although you wouldn't want to care about it at this stage (unless you already do a lot of A/B tests and are looking for ways to optimize the methodology), just in case you do, here are some interesting links around the multi-armed bandit approach (aka epsilon greedy) – Google support and some HN discussions. If you do care then not only will you enjoy the articles in those Hacker News links, but also the whole discussions. There are a few interesting comments around how multi-armed bandit is different from A/B testing and how you can use both of them together; hence, they shouldn't really be compared. A/B testing is more geared towards cases where the objective is to find the best performing version of your change, whereas the goal of a bandit is to optimize the given/current content. In practice this means that decisions like which video suggestions YouTube should show next to the main video you're watching, or which products should rank above others on e-commerce sites (and hence be more discoverable and purchasable), are all part of bandit optimization.

Also, this comment is a good read that states some significant issues with the multi-armed bandit approach from the perspective of a site owner/company. Although it also has its own debate and oh, such is life!

How it Works (Technical Details)?

Finally we're on to the technical aspects of how A/B testing works, at least in the browser. It's quite simple actually. The A/B testing software has to track users by assigning them the experiment ID and the variation ID of that experiment. This data has to be "remembered" and hence requires a client-side storage mechanism. The tools typically make heavy use of cookies for this purpose. Cookies set for a domain other than the site you're actually visiting (for instance dev.vwo.com setting a cookie while you browse a.com) are known as third-party cookies. Just FYI, first-party cookies are those set for the domain you're on (a.com setting cookies for a.com on a.com).

So when you land on example.com, which is running a split test using one of the tools we've discussed before, the tool sets multiple cookies to keep track of the user and maintain all sorts of experiment-related information about them. From generating a unique client ID against which the data will be updated on their servers to assigning the experiment IDs and variation data, cookies are just a simple storage mechanism that helps track the user's engagement.

Note: At this point you might wonder what happens if you have two experiments with different variations whose impacts on the UI conflict. Since different cookies will be set for the different experiments, you'll have UI conflicts if the different experiment variations are supposed to affect the same element/component. So before starting a new experiment, always make sure that such a case won't arise by reflecting upon all the previous experiments that you started and that are still running. If there are other team members who also run experiments on the same property, then make sure to talk to them and discuss any possible conflicting scenarios. This is very important, as the data that you eventually gather will all be useless because it would have been gathered over a conflicting set of UI elements/components that the visitors ended up seeing.

In the case of a split URL test too (on the same domain), cookies are dropped and the visitors are tracked further. But what happens when the control and variation are on different domains? How would a.com (control/original) set a cookie for b.com (variation) when it redirects a visitor to that version? What I mean is, let's say the user lands on a.com and the tool decides that he should belong to V1, which is b.com, and hence he'll be redirected. But after the redirection, how will b.com have the details regarding the experiment that was started on a.com and remember that the user belongs to that same experiment? Well, you cannot:

  • Directly execute document.cookie = 'test=1; domain=b.com' from a.com to set all sorts of cookies on b.com for user tracking. Browsers won't allow that and you know why (it wouldn't make sense if a third-party domain could set any number of useless cookies for your domain).
  • Create an iframe with b.com as source and then access its document object to set a cookie. Accessing the DOM of a cross-origin frame is not permitted by browsers.

But you can:

  • Use the postMessage() API to enable cross-origin communication. VWO's script on a.com could create an iframe with b.com as the src and then set a cookie for b.com using postMessage() – basically, postMessage() has to be called from a.com and, in the message event listener inside b.com's iframe window, the cookies have to be set. But what if the iframe takes a long time to load and by that time VWO has already redirected the user to the variation (b.com)? The browser won't get the chance to set the cookie inside the iframe and that exactly is the issue with this method, my friend! (See the sketch below.)
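Here's a bare-bones sketch of that postMessage() idea, just to make the mechanics (and the timing problem) concrete. The bridge page name is made up and this is not how VWO actually does it:

// --- On a.com (control), before redirecting to the variation ---
var frame = document.createElement('iframe');
frame.style.display = 'none';
frame.src = 'https://b.com/experiment-bridge.html'; // hypothetical page hosted on the variation domain
document.body.appendChild(frame);

frame.addEventListener('load', function () {
  // Hand over the experiment assignment; only code running on b.com can turn it into a b.com cookie.
  frame.contentWindow.postMessage({ name: 'vwo_exp_30_split', value: '2' }, 'https://b.com');
  // Problem: if the redirect to b.com fires before this load + message round-trip finishes,
  // the cookie never gets set – which is exactly the issue described above.
});

// --- On b.com, inside experiment-bridge.html ---
window.addEventListener('message', function (event) {
  if (event.origin !== 'https://a.com') return; // only trust messages from the control domain
  document.cookie = event.data.name + '=' + event.data.value + '; path=/; domain=b.com';
});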

So what to do? A third-party server can be used (it can be yours or someone else's). In this case we'll just discuss how VWO does it (in a simplified manner though); that'll give you a good idea of how to do it in general too. When you land on a.com, VWO's script will make a call to, say, dev.vwo.com/tpc to set a third-party cookie containing the experiment and variation ID. Imagine a request like this:

# POST request to
http://dev.vwo.com/tpc?name=vwo_exp_30_split&value=2&days=100
# Response will contain 'Set-Cookie: vwo_exp_30_split=2; expires=Mon, 11-Jul-16 19:03:59 GMT'

This cookie gets set on dev.vwo.com. Then a redirection is initiated to b.com (the variation), where the VWO snippet assigns a unique client ID (generated server-side) to a JS variable – _vwo_uuid_30=03EEC64F4AC572B73E5BFC966D94B379. It then makes another third-party cookie call to dev.vwo.com/tpc with this unique ID as the new parameter. So something like this happens:

# POST request to
http://dev.vwo.com/tpc?name=_vwo_uuid_30&value=03EEC64F4AC572B73E5BFC966D94B379&days=365
# Response headers will contain 'Set-Cookie: _vwo_uuid_30=03EEC64F4AC572B73E5BFC966D94B379; expires=Sun, 02-Apr-17 19:06:33 GMT'
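On the server side, such a /tpc style endpoint is conceptually tiny. Here's a simplified sketch (definitely not VWO's actual code; the article's snippet.php example uses PHP, but I'm sketching in JavaScript/Express to stay consistent with the rest of the code here):

// Simplified sketch of a '/tpc' style third-party cookie endpoint: echo the requested
// name/value back as a Set-Cookie header on this (third-party) domain, e.g. dev.vwo.com.
const express = require('express');
const app = express();

app.post('/tpc', (req, res) => {
  const { name, value, days } = req.query; // e.g. ?name=vwo_exp_30_split&value=2&days=100
  const expires = new Date(Date.now() + Number(days) * 86400 * 1000).toUTCString();
  res.setHeader('Set-Cookie', `${name}=${value}; expires=${expires}; path=/`);
  res.end();
});

app.listen(3000);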

Now imagine the user goes back to a.com. Since that is the control URL, the user will again get redirected to b.com (because dev.vwo.com has the vwo_exp_30_split=2 cookie set earlier), and the same uuid (03EEC64F4AC572B73E5BFC966D94B379 from b.com) will be set on a.com too. This time, when the VWO snippet loads from, for instance, dev.vwo.com/snippet.php (a dynamic endpoint), it receives the uuid cookie in the request headers and sets the same client ID cookie for a.com as well. Had the user gone not to a.com/ but to a.com/some-page, the same process would have repeated: the snippet gets the uuid cookie in the request headers and then sets it for a.com. If you notice, the experiment ID is already available from the uuid cookie name, which is _vwo_uuid_30 (30 in this case), and the variation was already there in the vwo_exp_30_split cookie value (2 in our case). Do keep in mind that when the user landed on b.com after the redirection, the tpc call created a cookie to identify the user uniquely, and the variation number/index was also known because snippet.php got it in the request and could then set the same cookie for b.com. snippet.php is basically an endpoint that spits out dynamic JS code (response content-type is text/javascript) with the help of a dynamic programming language (PHP in this case).

That's all that is required to track a user and show them the right changes on the webpage/site. Clever, right?! Surely the chances of you doing cross-domain split URL testing are rare, but it is just a good piece of information to know.

Bonus

On SEO

Some people sweat over the fact that different variations in an experiment, whether in an A/B or a Split URL test, with different content will adversely impact their rankings in Google search results. Well, if you follow certain (very simple) guidelines, then you really don't have to fret about it. Don't believe me? Believe Google!

So basically (summarizing that article), if your variation is hosted on a different URL, use the noindex meta tag there or modify your robots.txt so that the page isn't indexed in Google search results. The rel="canonical" tag also helps when you can't use noindex meta tags. Also keep a check on your redirects: they should be 302 (temporary) redirects or JavaScript ones. The last portion of that article is important: it says most of the changes that you make in a split test will probably not affect your search results, as it's mostly the user interface that changes and not the content, which is generally fine.

Statistical Significance/Confidence


Most of the tools you use will have a feature called the statistical significance indicator. This value basically tells you whether the uplift of a variation over the others is actually due to their differences and cannot be attributed to coincidence/chance. The convention is to call a variation statistically significant (or confident) if the value is 95% or above. Not sure if there's a hard rule behind that, but it's the convention that is followed. My suggestion would be to not make your variation live as the updated version unless your test reaches statistical significance. At times this might not happen, and then you'll have to use your intellect or experience to decide whether you should actually make your variation live or just scrap it. For instance, a variation might not lead to an uplift but might have an aesthetically better design according to you, your designer and the other people on the team. Then definitely go for it!

There will be times when your tool recommends running your tests for 8 weeks, or basically a lot of time, to reach that statistical threshold (95%). This might happen if your traffic is too low. Low traffic takes more time to statistically prove that the uplift or dip is not due to some randomness; high traffic takes less time to prove that the uplift/dip is due to the actual differences and not just randomness. So the two important factors driving statistical significance are traffic count and time.

VWO has an online calculator that you can use to calculate statistical significance based on your data. They've also written up how they calculate it and made an Excel sheet available containing another calculator with those formulas. Some mathematical nerds might also like this article.
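If you're curious about the math, here's a minimal sketch of the standard two-proportion z-test (not necessarily the exact formula VWO uses), fed with the example numbers from earlier in the article:

// Standard normal CDF via the Abramowitz–Stegun approximation.
function normalCdf(z) {
  var t = 1 / (1 + 0.2316419 * Math.abs(z));
  var d = 0.3989423 * Math.exp(-z * z / 2);
  var p = d * t * (0.3193815 + t * (-0.3565638 + t * (1.781478 + t * (-1.821256 + t * 1.330274))));
  return z > 0 ? 1 - p : p;
}

// One-tailed probability that the variation genuinely beats the control.
function significance(controlVisitors, controlConversions, variationVisitors, variationConversions) {
  var p1 = controlConversions / controlVisitors;
  var p2 = variationConversions / variationVisitors;
  var pooled = (controlConversions + variationConversions) / (controlVisitors + variationVisitors);
  var se = Math.sqrt(pooled * (1 - pooled) * (1 / controlVisitors + 1 / variationVisitors));
  var z = (p2 - p1) / se;
  return normalCdf(z);
}

// Numbers from the earlier example: V1 (50/1050) vs V2 (200/1100).
console.log((significance(1050, 50, 1100, 200) * 100).toFixed(2) + '% confidence'); // ~100% – a clear winner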

A/B/B Testing

A/B/B seems weird, so what is it? At times I've done a type of testing where there's a control (A) and a variant (B), and what I do is set up another variation in the campaign which is just a replica of the variant. There's no difference between V1 and V2; I just create two different variations that are the same. Such tests are called A/B/B (self-explanatory now) tests. But why would one do so?

This is done more from an experimentation (with data) point of view and adds another layer of certainty to the winning variation. When we moved from Optimizely to VWO, we wanted to make sure that VWO tracks visitors properly and reports sane numbers. We wanted to ensure that there were no vague differences (or mis-tracking) between the control and variation and that what it reports is actually very close to the real difference. A/B/B is helpful in such cases.

If B/B are almost identical then you can be sure that VWO is doing a good job at tracking and that the difference between the control and variation in terms of numbers can be believed. Whereas if B/B show a good deal of difference (imagine 30%) even after you've collected two weeks of data (which might happen, rarely), then you know that such a difference might also be applicable between A (control) and the worse performing B (variant). So in that case you'll have a range by which the data can be off, and you can act accordingly in your analysis. It might be hard to imagine such cases at the moment but it does happen at times, and that's when you end up spending more and more time analysing data here and there and digging further into other tools like Google Analytics to cross-check different metrics.

For similar reasons some people even do A/A/B/B testing. Of course that'll require more traffic; once you have high traffic you can be lucky and have as many variations as you want!

Summary

That was quite a bit but fairly easy to digest :). Now you know exactly what A/B (split) testing is and how it can help you set up your goals (objectives for visitors) on which you'd always want to increase conversions with various changes on your websites or apps. You also know how it all works technically, including cross-domain split URL testing (which is sort of what the initial motivation was to write this article, but I ended up writing all about split testing). You learnt about a couple of different tools that you can start using right away to kick off split testing on your platforms. So what are you waiting for? Start testing right away, and if you face any issues feel free to ask questions in the comments section below.

Note: Setting up and running tests is all good, but it is also very important to go through your result reports properly and verify which one is the winning variation. Lots of people make the mistake of choosing the wrong variation as the winner because the reports can be misleading at times (which gets rectified if you follow certain rules). A good example: in the first 2 days V2 might convert better than V1, but over the next 2 days, which constitute a weekend, V1 might perform better overall than V2. Yes, that happens a lot! I've spent quite some time going through various resources by people who have a fair amount of experience in A/B/n testing and will summarize my learnings and thoughts around analyzing data in the next article.

If you need help with A/B testing on your website, contact me.


Author: Rishabh

Rishabh is a full stack web and mobile developer from India. Follow me on Twitter.
