Lessons From The Roundabout SEO Test
My favorite comedian of all time, the late great Mitch Hedberg, once told a joke about what he called “The Roundabout AIDS test.”
While obviously not meant in any serious manner, the joke reminds me of my attitude towards SEO, or “Search Engine Optimization”. A lot of people spend an inordinate amount of time making sure all of their pages are specifically geared towards achieving a high ranking on Google. While I understand the business objectives of such an obsession, I find my own attitude towards SEO much more apathetic. I’m a lot more interested in how many people subscribe to Mike Industries than how many people typed in something like “expiring domain” and somehow ended up at my site.
So for that reason, my SEO activity is limited to my own little “Roundabout SEO Test”, which I perform a few times a year. It’s a very simple test and takes only a few seconds to execute. Here is the procedure:
1. Go to google.com.
2. Type in “mike”.
3. Hit return.
4. Take note of how high or low Mike Industries is on the list of results.
Yes, it seems a little narcissistic, and yes, it’s not a true measure of how well each page on this site is optimized for search engines, but it’s a general indication of how well or poorly this blog is doing and that’s really all I’m interested in.
In running this Roundabout SEO Test since creating Mike Industries last year, I’ve seen my ranking among Mikes climb from the thousands, to the hundreds, to the top 50, to the top 20, and most recently to number 5.
Number 5 is great and I’m totally cool with it considering that my parents were unoriginal enough to christen me with the most popular name in the 70s, 80s, and 90s, but now that the list of “competitors” is down to 4, I thought I’d take a close look at why the pecking order on Google is the way it is. In other words, what are the factors which most affect search results in the real world?
First, a rundown of the combatants:
1. Mike Little
Household Name: No
Code Quality: Very good.
Incoming Links: 16,400
“Mike” appearances in source: 54
Notes: Mike is known for his contributions to WordPress, and like Matt Mullenweg, auto-Matt-ically appears in the sidebars of many WordPress blogs. He is also a diabolical genius in his ability to put 54 mentions of his name in his page source without being “spammy”. I fear any attempt to “outmike” him would require me being “spammy”. He is the king, and will probably remain that way.
2. Michael Moore
Household Name: Yes
Code Quality: Average. Old-school table layout.
Incoming Links: 18,500
“Mike” appearances in source: 13
Notes: Michael makes wildly popular, wildly controversial films that are praised and reviled the world over.
3. Mike Portnoy
Household Name: No
Code Quality: Oh my god. This is not even HTML.
Incoming Links: 417
“Mike” appearances in source: 1
Notes: Mike is the drummer for a rock band called “Dream Theater”.
4. Mike DeWine
Household Name: Maybe in Ohio
Code Quality: Bad. Frames, tables, etc.
Incoming Links: 301
“Mike” appearances in source: 1
Notes: Mike is a senator from Ohio who apparently supports the frameset initiative.
5. Your self-absorbed host, me, Mike Davidson
Household Name: Ha!
Code Quality: Very good… minus the intentional invalidation stuff.
Incoming Links: 1,870
“Mike” appearances in source: 21
Notes: Works and plays well with others.
Ok, so there are two combatants that I’m particularly puzzled by here. See if you can guess which ones.
Number one is the Republican Senator from Ohio, Mike DeWine. His PR, Code Quality, Mentions, Links, and “number of mikes” in source are all less than mine, and yet he’s number 4. The guy’s entire site is a frameset too, so let’s put to rest any suspicions that framesets disqualify you from good SEO. The only things I can think of which may have led to Senator DeWine’s SEO ranking are 1) he has a .gov domain, which may yield an advantage over a .com, 2) of his measly 300 or so links, most might come from other .gov sites which have high PageRanks as well, and 3) most people might link to him by name (e.g. vote for “Mike DeWine”) whereas many of my links might not even contain my name (e.g. “sIFR” technology is evil and kills kittens).
Ok, so Senator DeWine is the first curious result, but the second is much more confounding and — really — 90% of the reason why I decided to even write this article:
Mike Portnoy is in a rock group called “Dream Theater” and whoever is doing his site for him writes the worst HTML I’ve ever seen. Seriously. The worst. It’s so bad that I’m just going to go ahead and paste the entire source of his front page right here:
<TITLE>Mike Portnoy.com - The Official Website</TITLE>
<LINK title=default href="/inc/mp2.css" type=text/css rel=stylesheet>
[<a href="/default2.asp">Enter MP.COM</a>]
Yep, that’s the whole thing. The kid who coded the site didn’t even give poor Mike an HTML tag. I guess I don’t blame him considering he probably only got an acid-washed denim jacket and a pair of drumsticks out of it. I also find it ironic that there is actually a call to a stylesheet amidst this neanderthal mess, but whatever. It’s awful… and I won’t even get into the actual design of the site. We’ve already done that once this year. :)
*Note: It looks like Mike’s front page was updated (slightly) for the better just this week. Still no HTML tags but here’s what it looked like for the past two years (and when this test was run).
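The “Mike” appearances figure from the rundown is easy to reproduce. Here is a minimal sketch in Python, assuming the count is nothing fancier than a case-insensitive substring match against the raw page source (the function name `count_mikes` is made up for illustration). Run against the three lines pasted above, it agrees with the single appearance listed for Mike Portnoy.

```python
import re

# The front-page source quoted above, verbatim.
PORTNOY_SOURCE = """<TITLE>Mike Portnoy.com - The Official Website</TITLE>
<LINK title=default href="/inc/mp2.css" type=text/css rel=stylesheet>
[<a href="/default2.asp">Enter MP.COM</a>]"""


def count_mikes(source: str, name: str = "mike") -> int:
    """Count case-insensitive occurrences of `name` anywhere in the source."""
    return len(re.findall(re.escape(name), source, flags=re.IGNORECASE))


print(count_mikes(PORTNOY_SOURCE))  # 1
```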
When I saw Mike Portnoy’s site at #3 among all Mikes and then peeped the corresponding code, I had a giant moment of doubt about the whole web standards movement. If the worst code any human being has ever produced gets a narrowly-known drummer to the front of 251 million Mikes on Google, is SEO even a valid benefit of writing standards-friendly code?
And furthermore, considering that said drummer has a PR of 5 and I have a PR of 8 (supposedly three orders of magnitude or 1000 times greater), does PageRank matter at all in search result ranking?
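The “1000 times” figure rests on the assumption that the toolbar PageRank scale is logarithmic with base 10. That is popular folk wisdom, not anything Google has confirmed, so treat the base in this little sketch as a guess (the function name is also mine):

```python
def pagerank_ratio(pr_a: int, pr_b: int, base: float = 10.0) -> float:
    """How many times 'stronger' pr_a is than pr_b on an ASSUMED log scale."""
    return base ** (pr_a - pr_b)


# PR 8 versus PR 5: three steps apart on the assumed base-10 scale.
print(pagerank_ratio(8, 5))  # 1000.0
```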
The only way we can find out is to test.
The Google Vacuum Tests
After witnessing the strange results from the Roundabout SEO Test, I set out to discover exactly what effect, if any, HTML code has on Google search results. The first thing I did was make up a word.
That word is “lodefizzle”… a nickname given to the legendary Stephen Lodefink of Finkbuilt.com during our days at the Disney Internet Group.
I created 15 files in a new directory on my site which contained the term “lodefizzle” in different contexts. Some had the word in the title element, some in the H1 element, some in the filename, etc etc etc. The idea was to see how treatment of this word affected search result ranking within Google.
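For anyone who wants to set up a similar vacuum test, here is a rough sketch in Python of how such a batch of pages might be generated. The filenames, the filler text, and the particular four variants are all hypothetical stand-ins, not the actual files I used.

```python
KEYWORD = "lodefizzle"


def page(title: str, body: str) -> str:
    """Wrap a title and a body fragment in a minimal HTML document."""
    return (
        "<html><head><title>{}</title></head>"
        "<body>{}</body></html>"
    ).format(title, body)


# Hypothetical filename -> markup; each variant treats the keyword differently.
variants = {
    "in-title.html": page(KEYWORD, "<p>Nothing to see here.</p>"),
    "in-h1.html": page("Test page", "<h1>{}</h1>".format(KEYWORD)),
    "in-text.html": page("Test page", "<p>{}</p>".format(KEYWORD)),
    # Keyword in the filename as well as once in normal text.
    KEYWORD + ".html": page("Test page", "<p>{}</p>".format(KEYWORD)),
}

for filename, markup in variants.items():
    # Show each filename with how often the keyword occurs in its markup.
    print(filename, markup.count(KEYWORD))
```

Each page then needs to be published and linked from somewhere so it actually gets crawled.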
Here are the results from the five tests performed (raw results here):
Test 1
Function: To determine if using semantic tags like H1 boosts ranking, and to what degree.
The important test case is 1b here. It’s a safe assumption that 1a would beat 1c (and indeed it does), or else we might as well quit our jobs and start using FrontPage. But the real question is, will one lousy extra instance of “lodefizzle” override the H1 tag’s advantage and bump 1b past 1a?
The answer: No.
Conclusion: The H1 tag does indeed assert some dominance and we can therefore assume you are better off using proper headings in your HTML documents.
Test 2
Function: To determine if the mere appearance of tables and nested tables penalizes search result ranking, and to what degree.
2a – Contains “lodefizzle” once in the document.
2b – Contains “lodefizzle” twice in the document and a 14-deep nested table.
2c – Contains “lodefizzle” once in the document and a 14-deep nested table.
All cases are important here, but once again, the one to watch is the middle one. So we have two questions to answer:
Does the appearance of a nested table penalize you at all in search engine ranking?
The answer: Yes.
Does an extra instance of “lodefizzle” in the source override this penalty?
The answer: Yes.
Conclusion: The mere appearance of many nested tables in your code does not have a strong enough negative effect to be considered a drag on search engine ranking.
I imagine the only negative effect at all was because the ratio of “lodefizzle” to other content on the page was slightly lowered by the addition of more code. This, however, could be an argument to keep your code as small as possible.
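To put a number on that ratio idea, here is a minimal sketch in Python. It assumes “ratio” means keyword characters divided by total characters of source, which is my guess at a reasonable measure and not anything Google has published. The two pages are hypothetical stand-ins for the plain and nested-table cases, not the real test files.

```python
import re


def keyword_ratio(source: str, keyword: str = "lodefizzle") -> float:
    """Characters belonging to the keyword divided by total source characters."""
    hits = len(re.findall(re.escape(keyword), source, flags=re.IGNORECASE))
    return hits * len(keyword) / len(source)


def nested_table(depth: int, inner: str) -> str:
    """Wrap `inner` in `depth` levels of single-cell tables."""
    for _ in range(depth):
        inner = "<table><tr><td>" + inner + "</td></tr></table>"
    return inner


# Hypothetical pages: same keyword count, very different amounts of markup.
plain = "<html><body><p>lodefizzle</p></body></html>"
bloated = (
    "<html><body>"
    + nested_table(14, "<p>filler</p>")
    + "<p>lodefizzle</p></body></html>"
)

print(round(keyword_ratio(plain), 3), round(keyword_ratio(bloated), 3))
```

The keyword count is the same in both pages, so the only thing dragging the second ratio down is the extra table markup.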
Test 3
Function: To determine if invalid code penalizes search result ranking, and to what degree.
3a – Contains “lodefizzle” once in the document, after a valid table.
3b – Contains “lodefizzle” twice in the document, after an invalid table.
3c – Contains “lodefizzle” once in the document, after an invalid table.
First, let me say that I expected all three of these results to be about the same, with 3b taking the cake only because of the extra mention. I’ve never really believed the claims of validatorians about bad code getting you booted off of Google.
So can invalid code get you penalized on search engines?
The answer: Yes, to a draconian degree, in fact.
3a is the only document which even shows up on Google. Whoa! A full-on blackballing from the Big G. We already know that 99% of the world’s web pages are invalid to some degree and they show up just fine on Google, so what happened here? Let’s look.
The valid table I created looks like this:
… while the invalid table looks like this:
The interesting part here is that while the source code of all three test cases looks pretty similar, the appearance of the pages in a browser is completely different. The invalid pages, in fact, don’t even render (at least in most browsers they don’t). The reason for this is the degree to which the code is malformed. The missing quote on the end of the “checkered” attribute is causing the entire page to fail and the word “lodefizzle” to not even appear on the rendered page.
What’s really interesting to me is that Google is doing one of two things (or both):
1. Somehow grading pages based on how they are rendered as well as how they are coded.
2. Simply counting the rest of the page as an attribute of the invalid table because the attribute is never officially closed off with an end quote.
There may also be other explanations for why this is happening, but this was the most interesting test in the bunch for me.
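The second possibility is easy to illustrate. Below is a minimal sketch in Python of a naive tokenizer that treats everything between an opening quote and its matching close as part of an attribute value. This is emphatically not Google’s parser, and the two markup strings are hypothetical stand-ins for my test tables (`class` is just a placeholder attribute), but it shows how one missing quote can swallow the rest of a page.

```python
def visible_text(html: str) -> str:
    """Return the text outside tags, treating quoted attribute values as opaque."""
    out = []
    in_tag = False
    quote = None  # the quote character currently open inside a tag, if any
    for ch in html:
        if in_tag:
            if quote:
                if ch == quote:  # matching close quote ends the value
                    quote = None
            elif ch in "\"'":
                quote = ch
            elif ch == ">":
                in_tag = False
        elif ch == "<":
            in_tag = True
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical markup: identical except for the missing closing quote.
valid = '<table class="checkered"><tr><td>cell</td></tr></table><p>lodefizzle</p>'
invalid = '<table class="checkered><tr><td>cell</td></tr></table><p>lodefizzle</p>'

print("lodefizzle" in visible_text(valid))    # True
print("lodefizzle" in visible_text(invalid))  # False
```

In the invalid case the quote never closes, so the tokenizer never leaves the table tag and the keyword never counts as page text.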
Conclusion: It’s not clear that validity helps search engine ranking, but it’s definitely true that certain errors in your code can get you completely removed from indexes.
Test 4
Function: To determine if semantics are more valuable than raw phrase placement on a page.
The question here is, are semantics powerful enough to override a simple shift in placement of the keyword?
The answer: Yes.
4a shows up well before 4b in the search results, even though keyword placement on 4b is theoretically better.
Conclusion: The semantic effect of hierarchical HTML headings is a stronger factor in search rankings than more rudimentary measures such as physical keyword placement.
Test 5
Function: To test if the advantage of clean, semantic HTML can be easily overridden by other factors.
5a – Contains “lodefizzle” in H1, H2, H3, and H4 tags.
5b – Contains “lodefizzle” four times in normal text and once in the page title.
5c – Contains “lodefizzle” four times in normal text and is linked once from my old site, mdavidson.com.
5d – Contains “lodefizzle” four times in normal text and once in the filename.
Since we already know that good semantics do matter to an extent, the question is: will any or all of the methods in 5b, 5c, and 5d be strong enough to usurp the proper use of hierarchical headings?
The answer: Yes, in all cases!
The winners, in order, were 5c, 5b, 5d… followed up by 5a.
*Note: Some time between when I ran these tests and now, the link between mdavidson.com and 5c ceased to be recognized by Google, causing 5c to drop to the bottom of the current search results. I have no idea why this happened, but for the purposes of this test, we’ll assume it didn’t.
Conclusion: Although good semantics are somewhat valuable in optimization, simple things like proper titles, descriptive filenames, and incoming links are dramatically more important.
Final analysis and lessons learned
Some of these results are commonsensical and some are not. None are in the least bit scientific, so please don’t tell me that. I already know that much better tests could be performed. The findings do support my initial suspicions about web standards as they relate to SEO though: that they matter about as much as a cheap umbrella in a hailstorm. That is to say: “kind of”.
Developers should write clean, semantic code as a matter of professionalism rather than search engine optimization. For good SEO, making your site sticky enough to attract quality incoming links is far and away the thing to concentrate on.
In revisiting the list of five combatants above, it’s now easy to see why Mike Portnoy’s front page did not penalize him too much on Google. It renders in the browser (even without an HTML tag!) and his 44,000 mentions around the web are apparently enough to bump him into the #3 spot. “But he has a PR of only 5!” you say. Well, apparently that doesn’t matter a whole lot either. From what I’ve been able to gather, PR has a much greater effect on the link love you’re able to give than the search engine placement you yourself actually receive.
While the findings from this test explain the placements of the top three Mikes, they still don’t tell us anything about the Republican Senator from Ohio, Mike DeWine. I seem to kill him in every measurable category and yet, he’s one slot ahead. The conspiracy theorist in me says that Larry and Sergey are closet puppeteers in the political process of this country, but the realist tells me there’s something else I haven’t considered. Perhaps it’s the unexposed “quality” of the incoming links… I don’t know.
What I do know is that, while this was a fun experiment, I won’t be mucking around with any more in-depth SEO tests.
The roundabout is good enough for me.