## How Does RankScience Conduct SEO Split Tests?

SEO split testing is a method of comparing two versions of a web page or app against each other to determine which one performs better. Split tests (or A/B tests) are experiments where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

The usual way people measure the effectiveness of deployed SEO A/B tests is by modeling the data with a Bayesian structural time series. This is an *OK* approach, but it does have some drawbacks. For example, it is very slow, because it requires collecting data both before and after applying treatments. It is also not necessarily the most effective way to measure interactions in multi-factor experiments.

While it can be very tempting to whip up a Google spreadsheet and a free online A/B test calculator to measure the effectiveness of the deployed tests, SEO data is more complicated than that. Keep in mind that we’re not dealing with conversion rates only.

What we like to do instead of the Bayesian time series is more statistically fundamental: a generalized linear model with a negative binomial distribution.
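
As a rough sketch of what such a model could look like (the exact specification is not given here, so the log link, the single treatment indicator, and the dispersion parameter $\alpha$ are assumptions for illustration):

$$
y_i \sim \mathrm{NegBin}(\mu_i, \alpha), \qquad \log \mu_i = \beta_0 + \beta_1 x_i, \qquad \mathrm{Var}(y_i) = \mu_i + \alpha \mu_i^2
$$

Here $y_i$ would be the click (or impression) count for page $i$, and $x_i$ would be 1 if the page received the variant and 0 otherwise. Under these assumptions, $\exp(\beta_1)$ is the multiplicative change in expected clicks associated with the variant, and the extra $\alpha \mu_i^2$ term lets the variance exceed the mean, which count data such as clicks often does.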

This SEO split testing strategy relies heavily on linear regression and the fact that, when pages are assigned to variants at random, it can establish a causal relationship between two variables. Linear regression is a linear approach to modelling the relationship between a dependent variable and one or more independent variables. When we have more than one explanatory variable, the process is called multiple linear regression.

Analysis between two variables may focus on any association between the variables, the value of one variable in predicting the other and the amount of agreement between variables. Regression analysis focuses on the form of the relationship between variables.

The single most important equation in all of applied statistics is this:

$$\beta = (X'X)^{-1}X'Y$$

It’s an equation that tells us how to estimate a linear relationship between a variable of interest Y (say, sales) and everything else X (say, the user’s web browser, operating system, anything else we can think of). If we plug some data into this equation, we can calculate β and then use it to predict the level of Y for some values of X, i.e.

$$y = x\beta$$

That is, once we’ve estimated β, we can predict how much a user will buy (y) based on their observable characteristics (x). Note that x and β are vectors, so multiplying them together is really just multiplying the individual components and adding up the results.
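
To make the estimate and the prediction concrete, here is a minimal pure-Python sketch. The data is made up, and the helper names (`ols`, `predict`, and so on) are hypothetical; it solves the normal equations (X′X)β = X′Y directly rather than forming the inverse, which gives the same β.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(X, y):
    """Estimate beta from the normal equations (X'X) beta = X'y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * v for x, v in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


def predict(x, beta):
    """y = x . beta: multiply the components pairwise and add them up."""
    return sum(xi * bi for xi, bi in zip(x, beta))


# Made-up data: column 0 is a constant 1 (intercept),
# column 1 is 1 when the observation has the characteristic of interest.
X = [[1, 0], [1, 0], [1, 1], [1, 1]]
y = [10.0, 12.0, 15.0, 17.0]

beta = ols(X, y)
prediction = predict([1, 1], beta)
```

With this toy data, the first component of `beta` is the average outcome without the characteristic and the second is the estimated change associated with having it.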

**Let’s relate this to SEO now.** With SEO, the observable characteristic (x) is an HTML change, and the outcome variable (y) is either clicks or impressions. We run split tests to find the value β for each different split test variant. The split test variant with the best β will yield the best (y) outcomes, which means the variant with the highest β is the winning HTML change that will maximize SEO.

Here’s an example… Let’s say we want to maximize clicks (who doesn’t?). That means clicks are our outcome variable (y). The variables that we have control over (which is x) include the title. One value for our titles could be *“{{product}} Online For Sale”*, where {{product}} is the name of the product on each landing page. Another title could be *“{{product}} Discounts For Online Orders”*. Both of these title formats are different values of (x), which will lead to different values of (y). Once we decide the values of (x) and empirically find the values of (y) by collecting data, we can derive the remaining β value to declare a split test winner.

## Here’s a simplified breakdown of how the RankScience SEO split testing process works:

- Randomly split all the pages into two groups
- Assign the first title format to the first group and the second title format to the second group – this is the (x) input variable
- Wait and collect click data over time from the SERPs
- After gathering enough data for both title formats, we use that data to see the values (y) for the outcome variables for each title format. Note that each different title format is a different (x), and each different (x) has its own (y)
- Now that we know the values of (x) and (y) for each title format, the last unknown is β, the coefficient that describes the strength of the relationship between (x) and (y). When we see that one title format’s β is greater than the other title format’s β, we declare that title format the winner
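
The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the production pipeline: the product names and click counts are made up, the helper names are hypothetical, and the estimate ignores statistical significance. It relies on the fact that, for a regression with an intercept and a single 0/1 variable for the title format, the fitted β on that variable equals the difference between the two group means.

```python
import random

# Title formats from the example; Python's str.format uses {product}
# where the templates in the text use {{product}}.
TITLE_A = "{product} Online For Sale"
TITLE_B = "{product} Discounts For Online Orders"


def split_pages(pages, seed=0):
    """Randomly split pages into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    shuffled = list(pages)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean(values):
    return sum(values) / len(values)


def estimate_effect(clicks_a, clicks_b):
    """Beta on the 0/1 title variable: mean clicks of B minus mean clicks of A."""
    return mean(clicks_b) - mean(clicks_a)


# Hypothetical landing pages, one per product.
products = ["Widget", "Gadget", "Gizmo", "Doohickey", "Thingamajig", "Whatsit"]

# Steps 1 and 2: random split, then assign one title format per group.
group_a, group_b = split_pages(products, seed=42)
titles = {p: TITLE_A.format(product=p) for p in group_a}
titles.update({p: TITLE_B.format(product=p) for p in group_b})

# Steps 3 and 4: made-up click counts collected per page in each group.
clicks_a = [120, 95, 110]
clicks_b = [130, 125, 135]

# Step 5: a positive effect favours title format B, a negative one format A.
effect = estimate_effect(clicks_a, clicks_b)
winner = TITLE_B if effect > 0 else TITLE_A
```

In practice the comparison would also need an uncertainty estimate before declaring a winner; the difference in means here only shows where β comes from.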

Even though correlation typically does not imply causation, correlation with linear regression does in fact imply causation for SEO split testing as long as the pages are randomly split into different groups. The randomness ensures that all aspects of the pages are controlled except for the (x) variable. Using the scientific method, when only one aspect of the page differs between two sufficiently large samples and the two samples have differing outcomes, the tester can conclude that the difference in that one aspect led to the different outcome. So in this case, **if all the pages are randomly split and the only consistent difference between them is the title format, then you can ascribe any SEO performance difference between the two groups to the different title**.

## OK… So how does this relate to what RankScience does for SEO?

When it comes to optimizing for organic traffic, we can use SEO split testing to pit two variations with a single differentiating element against each other, whether it is meta titles, descriptions, the copy, or even certain in-line CSS design elements within the site. Split testing also gives us a lot of data that we can then analyze in order to plan more aggressive advertising campaigns or to focus on a specific product or page.

## How do clicks and impressions for individual URLs come into play?

Tracking individual results and website performance data boils down to acting on the areas of opportunity found during our SEO audit process as well as evaluating the needs of the client. Clicks and impressions give us a very detailed picture that allows us to calculate performance improvements and to identify pages that are under-performing.

Clicks and impressions also give us a very important metric that is used by webmasters, marketers and SEO specialists all across the globe: click-through rate, or CTR. It is a simple ratio, clicks divided by impressions, that tells us the percentage of people who saw a page in the search results and clicked through to it. A page with a high CTR is generally a good indication that users find the listing relevant or interesting, which in turn can lead to increased sales or interactions with the target customer.
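
CTR is commonly computed as clicks divided by impressions. The short sketch below shows that calculation for a few pages and picks out the lowest performer; the URLs and counts are made up, and the `ctr` helper is hypothetical.

```python
def ctr(clicks, impressions):
    """Click-through rate as a fraction; 0.0 when there are no impressions."""
    if impressions == 0:
        return 0.0
    return clicks / impressions


# Made-up per-URL data: (url, clicks, impressions).
pages = [
    ("/widgets", 40, 1000),
    ("/gadgets", 5, 1000),
    ("/gizmos", 30, 500),
]

# CTR per URL, then the page with the lowest CTR as a candidate for testing.
ctrs = {url: ctr(clicks, impressions) for url, clicks, impressions in pages}
lowest = min(ctrs, key=ctrs.get)
```

Multiplying the fraction by 100 gives the percentage usually shown in reporting tools.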

## Can we break out an example experiment and how results were calculated?

The typical process of implementing split testing on a new page or website involves the following steps:

- Identify and isolate website page variable(s) for split testing. Some of the most common variables used in SEO split tests include meta titles, meta descriptions, header text and tags (h1, h2, h3), the copy, and design elements like the position, color and design of our client’s social media buttons or CTAs.
- Establish a control page and build webpage variations. Once we have selected which variables we want to change in a test, we apply the suggested changes to those specific variables while limiting our changes to a low number of variables, so that we can truly determine which elements result in better performance or increased metrics.
- Track and measure results. Using the data obtained in the previous step, we can then use our proprietary analytics suite and performance-testing software to determine which results work best for our client and to identify areas of opportunity such as trackable keywords, geo-targeting, advertising campaigns and more.

Using the data obtained in our previous three phases we can now determine which pages are the “winners” and which pages are the “losers”.
