Hypothesis Testing in SEO & Statistical Significance – Whiteboard Friday
Posted by Emily.Potter
A/B testing your SEO changes can bring you a competitive edge and help you dodge the bullet of negative changes that could lower your traffic. In this episode of Whiteboard Friday, Emily Potter shares not only why A/B testing your changes is important, but also how to develop a hypothesis, what goes into collecting and analyzing the data, and how to draw your conclusions.
Video Transcription
Howdy, Moz fans. I'm Emily Potter, and I work at Distilled over in our London office. Today I'm going to talk to you about hypothesis testing in SEO and statistical significance.
At Distilled, we use a platform called ODN, which is the Distilled Optimization Delivery Network, to do SEO A/B testing, and within it we use hypothesis testing. You may not be able to deploy ODN, but I still think you can learn something valuable from today's talk.
Hypothesis testing
The four main steps of hypothesis testing
So when we're using hypothesis testing, we use four main steps:
- First, we formulate a hypothesis.
- Then we collect data on that hypothesis.
- We analyze the data, and then...
- We draw some conclusions from that at the end.
The most important part of A/B testing is having a strong hypothesis. So up here, I've talked about how to formulate a strong SEO hypothesis.
1. Forming your hypothesis
Three mechanisms to help formulate a hypothesis
Now we need to remember that with SEO we're looking to impact one of three things to increase organic traffic.
- We're either trying to improve organic click-through rate. That's any change that makes your appearance in the SERPs more appealing than your competitors', so that more people click your result.
- Or we're trying to improve our organic rankings, so we're moving higher up.
- Or we're trying to rank for more keywords.
You could also be impacting a mixture of all three of these things. But you just want to make sure that one of these is clearly being targeted or else it's not really an SEO test.
2. Collecting the data
Now next, we collect our data. Again, at Distilled, we use the ODN platform to do this. Now, with the ODN platform, we do A/B testing, and we split pages up into statistically similar buckets.
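To make the idea of bucketing concrete, here's a minimal Python sketch of one common approach: deterministically hashing each URL into a control or variant bucket, so a page always lands in the same group. The ODN's actual bucketing method isn't public, so the hashing scheme, the example URLs, and the 50/50 split here are illustrative assumptions.

```python
# A minimal bucketing sketch, assuming a simple 50/50 hash split.
# The ODN's real method (and its check that buckets are statistically
# similar on pre-test traffic) is proprietary; this is illustrative only.
import hashlib

def assign_bucket(url: str) -> str:
    """Hash a URL so each page lands in the same bucket on every run."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return "variant" if int(digest, 16) % 2 == 0 else "control"

# Hypothetical example pages.
for url in [
    "https://example.com/products/red-widgets",
    "https://example.com/products/blue-widgets",
    "https://example.com/products/green-widgets",
]:
    print(url, "->", assign_bucket(url))
```

In practice you'd also verify that the two buckets tracked each other closely on pre-test traffic before trusting the split.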
A/B test with your control and your variant
So once we do that, we take our variant group and use a mathematical analysis to predict what the variant group would have done had we not made the change.
So up here, that's what the black line is doing: it's showing what our model predicted the variant group would do if we had not made any change. This dotted line here is when the test began, and you can see there was a separation after the test started. This blue line is what actually happened.
Now, because there's a difference between these two lines, we can see a change. If we move down here, we've just plotted the difference between those two lines.
Because the blue line is above the black line, we call this a positive test. Now this green part here is our confidence interval, and this one, as is standard, is a 95% confidence interval; we use that because we're doing statistical testing. When the green lines are all above the zero line, or all below it for a negative test, we can call this a statistically significant test.
For this one, our best estimate is that the change increased sessions by 12%, which works out to roughly 7,000 monthly organic sessions. Now, on either side here, you can see I've written 2.5%. That's to make it all add up to 100%, and the reason is that you never get a 100% confident result: there's always a chance the result is random noise and you have a false positive or a false negative. That's why we say we're 97.5% confident this was positive: the 95% interval plus the 2.5% tail above it.
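If you want to play with the kind of counterfactual analysis described above, here's a minimal sketch, assuming you have daily session counts for each bucket. The ODN's real model is more sophisticated; this stand-in just regresses the variant bucket's pre-test sessions on the control bucket's and forecasts the post-test period, and the data is fabricated for illustration.

```python
# Sketch of a counterfactual forecast from daily bucket sessions.
# NOT the ODN's actual model: a minimal OLS stand-in on fake data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

# Fabricated data: 60 pre-test days, then 30 post-test days.
control = rng.normal(1000, 50, 90)               # control bucket sessions
variant = 0.9 * control + rng.normal(0, 30, 90)  # variant tracks control
variant[60:] *= 1.12                             # pretend a 12% uplift post-launch

pre, post = slice(0, 60), slice(60, 90)

# Fit "variant as a function of control" on the pre-test period only.
model = sm.OLS(variant[pre], sm.add_constant(control[pre])).fit()

# Forecast what the variant *would* have done with no change.
forecast = model.get_prediction(sm.add_constant(control[post]))
expected = forecast.predicted_mean
lower, upper = forecast.conf_int(obs=True, alpha=0.05).T  # 95% band

uplift = variant[post] - expected                # the "difference" line
outside = np.sum((variant[post] < lower) | (variant[post] > upper))
print(f"Mean daily uplift: {uplift.mean():.0f} sessions")
print(f"Post-test days outside the 95% band: {outside}")
```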
Tests without statistical significance
Now, at Distilled, we've found a lot of circumstances where a test is not statistically significant, but there's still pretty strong evidence of an uplift. If we move down here, I have an example of that.
Now you can see our green line still has an area that dips below zero, and that's saying there's still a chance, at a 95% confidence interval, that this was a negative test. If we drop down again below, I've drawn our pink tails again, but this time with 5% on both sides, which makes it a 90% confidence interval. Because the whole interval sits above zero and the 5% tail above it counts too, we can say we're 95% confident there was a positive result: 90 plus 5.
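Here's a small sketch of that "add the tail" arithmetic from both examples: converting a symmetric two-sided interval into a one-sided confidence that the effect is positive, assuming the effect estimate is roughly normally distributed. The interval endpoints are made-up numbers.

```python
# Turning a symmetric two-sided confidence interval into a one-sided
# confidence that the effect is positive. Endpoints are illustrative.
from scipy.stats import norm

def confidence_positive(ci_low: float, ci_high: float,
                        level: float = 0.95) -> float:
    """P(true effect > 0), assuming a normal sampling distribution."""
    z = norm.ppf(1 - (1 - level) / 2)      # 1.96 for a 95% interval
    estimate = (ci_low + ci_high) / 2      # midpoint of the interval
    se = (ci_high - ci_low) / (2 * z)      # implied standard error
    return norm.cdf(estimate / se)

# Significant test: whole 95% interval above zero.
print(f"{confidence_positive(0.01, 0.23):.1%}")    # ~98.4%, above the 97.5% floor

# Not significant at 95%: interval barely dips below zero.
print(f"{confidence_positive(-0.005, 0.20):.1%}")  # ~96.9%, strong evidence anyway
```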
3. Analyzing the data to test the hypothesis
Now the reason we do this is to be able to implement changes we have a strong hypothesis for and capture those wins, instead of rejecting them outright. Part of the reason is also that, as we say, we're doing business, not science.
Here I've created a chart of when we would consider deploying a test that was not statistically significant, based on how strong or weak the hypothesis is and how cheap or expensive the change is; a rough code sketch of this decision rule follows the four quadrants below.
Strong hypothesis / cheap change
Now over here, in your top right corner, when we have a strong hypothesis and a cheap change, we'd probably deploy that. For example, we had a test like this recently with one of our clients at Distilled, where they added their main keyword to the H1.
The final result looked something like this graph here. It was a strong hypothesis, it wasn't an expensive change to implement, and we decided to deploy the test because we were pretty confident it would still have a positive effect.
Weak hypothesis / cheap change
Now on this other side here, if you have a weak hypothesis but it's still cheap, then maybe evidence of an uplift is still reason to deploy that. You'd have to communicate with your client.
Strong hypothesis / expensive change
On the expensive change with a strong hypothesis, you're going to have to weigh the cost against the return on investment: calculate your expected revenue based on the percentage change the test measured.
Weak hypothesis / expensive change
When it's a weak hypothesis and expensive change, we would only want to deploy that if it's statistically significant.
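Here's the promised sketch of the matrix as a rough decision rule. It assumes the test showed at least some evidence of a positive uplift, and the binary strong/cheap inputs are a simplification for illustration, not a formal Distilled policy.

```python
# A rough sketch of the deploy decision matrix above. Inputs are
# deliberately binary; real decisions weigh ROI and client context.
def deploy_verdict(hypothesis_strong: bool, change_cheap: bool,
                   significant: bool) -> str:
    """Assumes the test showed evidence of a positive uplift."""
    if significant:
        return "deploy"                                   # clear, significant win
    if hypothesis_strong and change_cheap:
        return "deploy"                                   # strong evidence is enough
    if change_cheap:
        return "maybe deploy; discuss with the client"    # weak hypothesis, cheap
    if hypothesis_strong:
        return "weigh expected revenue against the cost"  # strong, expensive
    return "hold; only deploy if significant"             # weak and expensive

print(deploy_verdict(hypothesis_strong=True, change_cheap=True,
                     significant=False))  # -> deploy
```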
4. Drawing conclusions
Now we need to remember that when we're doing hypothesis testing, all we're doing is testing against the null hypothesis. A null result does not mean there was no effect at all. All it means is that we cannot reject the null hypothesis: the data is too noisy for us to say whether the effect is real or not.
At a 95% confidence level, we can reject the null hypothesis and say our data is not noise. When it's less than 95% confidence, like this one over here, we can't claim we learned something the way we would with a scientific test, but we can still say we have some pretty strong evidence that the change would produce a positive effect on these pages.
The advantages of testing
Now when we talk to our clients about this, it's because we're aiming to give them a competitive advantage over other people in their verticals. And the main advantage of testing is avoiding negative changes.
We want to make sure the changes we're making aren't causing traffic to plummet, and we see that a lot. At Distilled, we call that a dodged bullet.
Now this is something I hope you can bring into your work and use with your clients or on your own website. Hopefully you can start formulating hypotheses, and even if you can't deploy something like ODN, you can still use your GA data, as in the sketch below, to get a better idea of whether the changes you're making are helping or hurting your traffic. That's all I have for you today. Thank you.
Video transcription by Speechpad.com
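If you can't deploy something like ODN, a crude pre/post comparison of exported Google Analytics data can at least flag large swings, as Emily suggests above. The sketch below assumes a hypothetical CSV export named organic_sessions.csv with date and sessions columns; the change date is a placeholder, and unlike a true A/B test this approach can be confounded by seasonality and algorithm updates.

```python
# Rough pre/post check on exported GA data. Weaker than a real A/B
# test: seasonality and algorithm updates can confound the result.
import pandas as pd
from scipy.stats import ttest_ind

CHANGE_DATE = "2018-09-01"  # placeholder: the day you shipped the change

df = pd.read_csv("organic_sessions.csv", parse_dates=["date"])
pre = df.loc[df["date"] < CHANGE_DATE, "sessions"]
post = df.loc[df["date"] >= CHANGE_DATE, "sessions"]

t_stat, p_value = ttest_ind(post, pre, equal_var=False)  # Welch's t-test
print(f"Pre mean: {pre.mean():.0f}  Post mean: {post.mean():.0f}")
print(f"p-value: {p_value:.3f} (lower = less likely to be noise)")
```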