Posted by randfish

We've arrived at one of the meatiest SEO topics in our series: technical SEO. In this fifth part of the One-Hour Guide to SEO, Rand covers essential technical topics from crawlability to internal link structure to subfolders and far more. Watch on for a firmer grasp of technical SEO fundamentals!



Video Transcription

Howdy, Moz fans, and welcome back to our special One-Hour Guide to SEO Whiteboard Friday series. This is Part V - Technical SEO. I want to be totally upfront. Technical SEO is a vast and deep discipline like any of the things we've been talking about in this One-Hour Guide.

There is no way in the next 10 minutes that I can give you everything that you'll ever need to know about technical SEO, but we can cover many of the big, important, structural fundamentals. So that's what we're going to tackle today. You will come out of this having at least a good idea of what you need to be thinking about, and then you can go explore more resources from Moz and many other wonderful websites in the SEO world that can help you along these paths.

1. Every page on the website is unique & uniquely valuable

First off, every page on a website should be two things — unique, unique from all the other pages on that website, and uniquely valuable, meaning it provides some value that a user, a searcher would actually desire and want. Sometimes the degree to which it's uniquely valuable may not be enough, and we'll need to do some intelligent things.

So, for example, if we've got a page about X, Y, and Z versus a page that's sort of, "Oh, this is a little bit of a combination of X and Y that you can get to through searching and then filtering this way. Oh, here's another copy of that X and Y, but it's a slightly different version. Here's one with Y and Z. This is a page that has almost nothing on it, but we sort of need it to exist for some odd reason that has nothing to do with search, and no one would ever want to find it through search engines."

Okay, when you encounter these types of pages, as opposed to the unique and uniquely valuable ones, you want to think about: Should I be canonicalizing those, meaning pointing the near-duplicate back to the original for search engine purposes? Maybe the YZ page just isn't different enough from Z for it to be a separate page in Google's eyes and in searchers' eyes. So I'm going to use something called the rel=canonical tag to point that YZ page back to Z.
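As a minimal sketch, assuming the YZ variant lives at a hypothetical URL like example.com/y-z and the original Z page at example.com/z, the tag would sit in the <head> of the YZ page and look something like this:

    <!-- In the <head> of the near-duplicate YZ page (hypothetical URLs for illustration) -->
    <link rel="canonical" href="https://example.com/z" />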

Maybe I want to remove these pages. Oh, this is totally non-valuable to anyone. 404 it. Get it out of here. Maybe I want to block bots from accessing this section of our site. Maybe these are search results that make sense if you've performed this query on our site, but they don't make any sense to be indexed in Google. I'll keep Google out of it using the robots.txt file or the meta robots or other things.
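A rough sketch of those two blocking options, using a hypothetical /search/ directory of on-site results, might look like this (note that robots.txt keeps crawlers out entirely, while a meta robots noindex lets them crawl the page but asks them not to index it):

    # robots.txt (hypothetical path for illustration)
    User-agent: *
    Disallow: /search/

    <!-- Or, in the <head> of an individual page you want kept out of the index -->
    <meta name="robots" content="noindex, follow">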

2. Pages are accessible to crawlers, load fast, and can be fully parsed in a text-based browser

Secondarily, pages should be accessible to crawlers, and they should load fast, as fast as you possibly can. There are a ton of resources about optimizing images, optimizing server response times, optimizing first paint and first meaningful paint, and all these different things that go into speed.

But speed is good not only for technical SEO reasons, meaning Google can crawl your pages faster. Oftentimes, when people speed up the load times of their pages, they find that Google crawls more of their pages and crawls them more frequently, which is a wonderful thing. Speed is also good because pages that load fast make users happier. When you make users happier, you make it more likely that they will link, amplify, share, come back, keep loading pages, and not click the back button, all these positive things while avoiding the negative ones.

They should be able to be fully parsed in essentially a text browser, meaning that if a relatively unsophisticated browser that doesn't do a great job of processing JavaScript, post-loading of script events, or other types of content, Flash and stuff like that, visits the page, a spider should still be able to see all of the meaningful content, in text form, that you want to present.

Google is still not processing every image at the "I'm going to analyze everything that's in this image and extract the text from it" level, nor is it doing that with video, nor with many kinds of JavaScript and other scripts. So I would urge you, as would many other SEOs, notably Barry Adams, a well-known SEO who says that JavaScript is evil, which may be taking it a little bit far, but we catch his meaning, to make sure that everything you want search engines to see loads on these pages in HTML, as text.
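To make that concrete, here's a purely illustrative sketch of the same content delivered in a crawler-friendly way versus a JavaScript-only way that a text-based crawler could miss:

    <!-- Crawler-friendly: the meaningful content is right in the HTML -->
    <h1>Seattle Storage Facilities</h1>
    <p>Our top 10 places to store your stuff in Seattle...</p>

    <!-- Risky: the content only exists after a script runs, so a text-based
         crawler that doesn't execute JavaScript sees an empty page -->
    <div id="content"></div>
    <script>
      document.getElementById("content").innerHTML =
        "<h1>Seattle Storage Facilities</h1><p>Our top 10 places...</p>";
    </script>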

3. Thin content, duplicate content, spider traps/infinite loops are eliminated


Thin content and duplicate content — thin content meaning content that doesn't provide meaningfully useful, differentiated value, and duplicate content meaning it's exactly the same as something else — along with spider traps and infinite loops, like calendaring systems, should generally speaking be eliminated. If duplicate versions exist for some reason, for example a printer-friendly version of an article alongside the regular version and the mobile version, there should probably be some canonicalization going on there, with the rel=canonical tag being used to say, "This is the original version, and here's the mobile-friendly version," and those kinds of things.
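As a hedged sketch with hypothetical URLs, the printer-friendly version would canonicalize back to the original, and a separate mobile URL would typically be paired with a rel=alternate/rel=canonical annotation:

    <!-- On the printer-friendly version (hypothetical URL): point back to the original -->
    <link rel="canonical" href="https://example.com/article" />

    <!-- On the desktop original, declare the mobile alternate... -->
    <link rel="alternate" media="only screen and (max-width: 640px)"
          href="https://m.example.com/article" />
    <!-- ...and on the mobile version, canonicalize back to the desktop original -->
    <link rel="canonical" href="https://example.com/article" />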

If your site's internal search results pages are showing up in the search results, Google generally prefers that you don't do that. If you have slight variations, Google would prefer that you canonicalize those, especially if the filters on them are not meaningfully and usefully different for searchers.

4. Pages with valuable content are accessible through a shallow, thorough internal link structure

Number four, pages with valuable content on them should be accessible through just a few clicks, in a shallow but thorough internal link structure.

Now this is an idealized version. You're probably rarely going to encounter exactly this. But let's say I'm on my homepage and my homepage has 100 links to unique pages on it. That gets me to 100 pages. One hundred more links per page gets me to 10,000 pages, and 100 more gets me to 1 million.

So that's only three clicks from homepage to one million pages. You might say, "Well, Rand, that's a little bit of a perfect pyramid structure." I agree. Fair enough. Still, three to four clicks to any page on any website of nearly any size, unless we're talking about a site with hundreds of millions of pages or more, should be the general rule. I should be able to follow that path through either a sitemap or your internal link structure.

If you have a complex structure and you need to use a sitemap, that's fine. Google is fine with you using an HTML page-level sitemap. Or alternatively, you can just have a good link structure internally that gets everyone easily, within a few clicks, to every page on your site. You don't want to have these holes that require, "Oh, yeah, if you wanted to reach that page, you could, but you'd have to go to our blog and then you'd have to click back to result 9, and then you'd have to click to result 18 and then to result 27, and then you can find it."

No, that's not ideal. That's too many clicks to force people to make to get to a page that's just a little ways back in your structure. 
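To picture the HTML page-level sitemap option mentioned a moment ago: it's nothing fancy, just a plain, crawlable page of links. A hypothetical sketch:

    <!-- /sitemap.html: a plain, crawlable list of links (hypothetical URLs) -->
    <h1>Site Map</h1>
    <ul>
      <li><a href="/seattle/">Seattle</a>
        <ul>
          <li><a href="/seattle/storage-facilities/">Seattle storage facilities</a></li>
        </ul>
      </li>
      <li><a href="/portland/">Portland</a></li>
      <li><a href="/la/">Los Angeles</a></li>
    </ul>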

5. Pages should be optimized to display cleanly and clearly on any device, even at slow connection speeds

Five, I think this is obvious, but for many reasons, including the fact that Google considers mobile friendliness in its ranking systems, you want to have a page that loads clearly and cleanly on any device, even at slow connection speeds, optimized for both mobile and desktop, optimized for 4G and also optimized for 2G and no G.
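One small, common building block of that (a hedged example, not a complete recipe) is the responsive viewport meta tag, paired with CSS that adapts to screen width:

    <!-- In the <head>: tells mobile browsers to lay the page out at device
         width instead of a zoomed-out desktop view -->
    <meta name="viewport" content="width=device-width, initial-scale=1">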

6. Permanent redirects should use the 301 status code, dead pages the 404, temporarily unavailable the 503, and all okay should use the 200 status code

Permanent redirects. So this page was here. Now it's over here. This old content, we've created a new version of it. Okay, old content, what do we do with you? Well, we might leave you there if we think you're valuable, but we may redirect you. If you're redirecting old stuff for any reason, it should generally use the 301 status code.

If you have a dead page, it should use the 404 status code. You could maybe sometimes use 410, permanently removed, as well. Temporarily unavailable, like we're having some downtime this weekend while we do some maintenance, 503 is what you want. Everything is okay, everything is great, that's a 200. All of your pages that have meaningful content on them should have a 200 code.
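How you set those codes depends on your server and CMS. As a rough sketch, assuming an nginx-style configuration and hypothetical paths, it might look something like this:

    # nginx-style sketch (hypothetical paths; Apache and other servers differ)
    location = /old-page/ {
        return 301 https://example.com/new-page/;  # permanent redirect
    }
    error_page 404 /404.html;                      # dead pages get a 404
    location = /maintenance/ {
        return 503;                                # temporarily unavailable
    }
    # Pages that serve normally return 200 by default.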

Status codes other than these, with the possible exception of the 410, should generally speaking be avoided. There are some very occasional, rare edge use cases. But if you find status codes other than these, for example if you're using Moz, which crawls your website, reports all this data to you, and does a technical audit every week, Moz or other software like it, Screaming Frog or Ryte or DeepCrawl or those other kinds, will say, "Hey, this looks problematic to us. You should probably do something about this."

7. Use HTTPS (and make your site secure)

When you are building a website that you want to rank in search engines, it is very wise to use a security certificate and to have HTTPS rather than HTTP, the non-secure version. Those should also be canonicalized. There should never be a time when the HTTP version is the one that loads instead. Google also gives a small reward — I'm not even sure it's that small anymore; it might be fairly significant at this point — to pages that use HTTPS, or a penalty to those that don't.
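Canonicalizing to HTTPS usually just means a site-wide 301 from the HTTP version. A hedged nginx-style sketch (syntax varies by server):

    # Send all HTTP traffic to the HTTPS version (nginx-style sketch)
    server {
        listen 80;
        server_name example.com www.example.com;
        return 301 https://example.com$request_uri;
    }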

8. One domain > several, subfolders > subdomains, relevant folders > long, hyphenated URLs

In general, well, I don't even want to say in general. It is nearly universal, with a few edge cases — if you're a very advanced SEO, you might be able to ignore a little bit of this — but it is generally the case that you want one domain, not several. Allmystuff.com, not allmyseattlestuff.com, allmyportlandstuff.com, and allmylastuff.com.

Allmystuff.com is preferable for many, many technical reasons and also because the challenge of ranking multiple websites is so significant compared to the challenge of ranking one. 

You want subfolders, not subdomains, meaning I want allmystuff.com/seattle, /la, and /portland, not seattle.allmystuff.com.

Why is this? Google's representatives have sometimes said that it doesn't really matter and that I should do whatever is easy for me. But I have so many cases over the years, case studies of folks who moved from a subdomain to a subfolder and saw their rankings increase overnight. Credit to Google's reps; I'm sure they're getting their information from somewhere.

But very frankly, in the real world, it just works all the time to put it in a subfolder. I have never seen a problem with being in the subfolder, whereas subdomains create so many problems and so many issues that I would strongly, strongly urge you against them. I think 95% of professional SEOs who have ever had a case like this would do likewise.

Relevant folders should be used rather than long, hyphenated URLs. This is one where we agree with Google. Google generally says, hey, if you have allmystuff.com/seattle/storagefacilities/top10places, that is far better than /seattle-storage-facilities-top-10-places. It's just the case that Google is good at folder structure analysis and organization, users like it as well, and good breadcrumbs come out of it.

There's a bunch of benefits. Generally using this folder structure is preferred to very, very long URLs, especially if you have multiple pages in those folders. 

9. Use breadcrumbs wisely on larger/deeper-structured sites

Last, but not least, at least last that we'll talk about in this technical SEO discussion, is using breadcrumbs wisely. Breadcrumbs are actually both a technical and an on-page item, and they're good for this.

Google generally learns some things about the structure of your website from breadcrumbs. They also give you this nice benefit in the search results, where Google shows your URL in this friendly way, especially on mobile, more so than desktop. They'll show Home > Seattle > Storage Facilities. Great, looks beautiful. Works nicely for users. It helps Google as well.
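One widely used way to reinforce those breadcrumbs for Google, sketched here with hypothetical URLs as an assumption rather than anything covered above, is schema.org's BreadcrumbList markup alongside the visible breadcrumb links themselves:

    <!-- schema.org BreadcrumbList markup (hypothetical URLs for illustration) -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Home",
          "item": "https://example.com/" },
        { "@type": "ListItem", "position": 2, "name": "Seattle",
          "item": "https://example.com/seattle/" },
        { "@type": "ListItem", "position": 3, "name": "Storage Facilities",
          "item": "https://example.com/seattle/storage-facilities/" }
      ]
    }
    </script>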

So there are plenty more in-depth resources that we can go into on many of these topics and others around technical SEO, but this is a good starting point. From here, we will take you to Part VI, our last one, on link building next week. Take care.

Video transcription by Speechpad.com
