Category Archives: Search Engine Basics


SEO has some fairly easy to track KPIs. Our goal is to take actions that will result in higher search engine rankings, which result in more traffic, which in turn results in more conversions on our site.

These then are the KPIs of SEO:

  • Keyword Rankings for relevant terms
  • Organic traffic from search engines, including tracking what pages that traffic goes to
  • Conversions on-site from organic traffic. This can be direct sales in the case of ecommerce sites (in which case you’re measuring revenue from search engines), or RFQ form submissions, or email list sign-ups.

We’ll discuss how to actually measure all of these KPIs in the Analytics and Tools section. But at a high level, these are the KPIs that every SEO should be tracking.

A Few Words on Keyword Rankings

There has been a movement within the SEO industry for the last few years to move away from keyword rankings. This push is coming from the agency world in particular. The usual logic is that clients tend to obsess over certain rankings and fail to look at the big picture. By focusing clients on something other than keyword rankings, agencies hope to stop hearing complaints about one single keyword and get the client to pay more attention to the entire campaign.

I strongly disagree with the notion of ignoring keyword rankings.

Keyword rankings are one of the Key Performance Indicators for a good reason — improving rankings is the very heart of an SEO’s job. It is, in fact, nearly all of what we do. It is what the agency is being paid for! I understand that clients can become too focused on rankings (I’ve worked in agency life too) but a better solution is to educate clients, not throw out your ability to measure your results.

I would never trust an SEO who did not track key rankings and did not try to tie actions to ranking improvements. That SEO would never really know if they were making a difference on the site.

Keep in mind that search engine usage increases every year. Each year there are more searches than the year before. So, depending on your industry, if you did nothing at all and your rankings stayed exactly where they are, you would still get more traffic each year.

[Figure: Desktop Google searches, in millions. Source: comScore]

[Figure: Growth of mobile searches on Google. Source: Google]

So an SEO agency that doesn’t believe in keyword rankings could very likely be doing nothing useful at all, and still be able to report year over year organic traffic growth. And you’d better believe they’ll take credit for that growth. I have never in my life seen an SEO agency that said, “Well, some percentage of that growth is just natural and not due to anything we did.” Oh no: if there’s 10% year over year growth, the agency is taking credit for all 10%.

Similarly, if traffic goes down, keyword rankings give you the ability to troubleshoot. If all your rankings are about the same (and you are tracking the right rankings) then you know it’s a seasonal decline, or just a random decline in search behavior. If your rankings do go down, you can pinpoint which keywords fell and begin to diagnose why. Perhaps some competitors have been upping their game, and if so, you should know about it! Or perhaps a tactic that you used gave only a brief freshness lift, and didn’t have lasting effects. You need to know that too, so that you don’t keep spending your time doing the same useless thing across the entire site.

I like to think that the agencies who don’t believe in keyword rank tracking really are just frustrated with clients who don’t get the big picture — but if an agency wanted to scam their clients, this is also exactly how to do it.

Keyword rankings are a KPI because if you get a traffic increase without a corresponding keyword ranking increase, then you were not responsible for that increase. Similarly, if keywords increase in ranking that you were not working on, odds are good that you were not responsible for that increase either.

The flip side of this is that increasing rankings is not enough. It has to be rankings that people are actually searching for in decent numbers: thus we also track organic traffic. And furthermore it has to be rankings that matter to your site — if you sell sneakers, ranking and getting traffic for “snowmen” isn’t going to be helpful to your business: thus we also track conversions from organic traffic.

If you leave any one of these KPIs out of the picture, you lose the ability to accurately measure the impact of your SEO efforts.
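The reasoning above can be sketched as a small diagnostic. This is only an illustration of the logic in this section, not a real attribution model: the field names, the 0.5-position tolerance, and the idea of dividing out overall search growth are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class KpiSnapshot:
    """Year-over-year change in each KPI (hypothetical fields for this sketch)."""
    avg_rank_change: float       # positions gained on tracked keywords (positive = better)
    traffic_growth: float        # organic traffic growth, e.g. 0.10 for +10%
    conversion_growth: float     # growth in conversions from organic traffic
    market_search_growth: float  # assumed growth in overall search volume


def residual_growth(traffic_growth: float, market_search_growth: float) -> float:
    """Traffic growth left over once overall search growth is divided out."""
    return (1 + traffic_growth) / (1 + market_search_growth) - 1


def diagnose(s: KpiSnapshot, rank_tolerance: float = 0.5) -> str:
    """Map the three KPIs to the explanation this section argues for."""
    rankings_up = s.avg_rank_change > rank_tolerance
    rankings_down = s.avg_rank_change < -rank_tolerance
    if rankings_down:
        return "rankings fell: find which keywords dropped and diagnose why"
    if s.traffic_growth > 0 and not rankings_up:
        return "traffic up without ranking gains: probably not due to SEO work"
    if s.traffic_growth < 0:
        return "traffic down without a ranking drop: likely seasonal or lower search demand"
    if rankings_up and s.traffic_growth > 0 and s.conversion_growth <= 0:
        return "rankings and traffic up, conversions flat: the keywords may not matter to the business"
    if rankings_up and s.traffic_growth > 0:
        return "rankings, traffic and conversions all up: SEO work is a plausible cause"
    return "no clear movement"


# A site whose rankings did not move but whose traffic grew 10% in a market
# that also grew 10%: nothing is left over to take credit for.
flat = KpiSnapshot(avg_rank_change=0.0, traffic_growth=0.10,
                   conversion_growth=0.05, market_search_growth=0.10)
print(diagnose(flat))
print(f"growth beyond the market: {residual_growth(0.10, 0.10):.1%}")
```

The point of the sketch is that none of the three numbers means much alone; it is the combination that tells you whether your work made the difference.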

The 5 Things SEOs Do

To excel at SEO you need a strong understanding not just of individual tactics, but the big picture as well. Knowing how everything comes together and how much each piece matters will help you keep your efforts focused on the things that make the biggest difference. This is what will keep you from chasing pennies down the street when there’s gold lying around. It’s the difference between randomly pursuing tactics and best practices and developing a logical strategy.

This page is the 10,000 foot view of SEO: the broad structure that all our tactics are built around. Here we’re going to look at the five things SEOs can do that actually matter, and on the next page we’ll look at the Key Performance Indicators (KPIs) that we can use to measure progress.

The 5 Things SEOs Do

Obviously the goal of SEO is to get more relevant traffic from search engines. Ultimately just about everything we do in SEO falls into one of these five categories, because just about everything that can actually boost your organic traffic falls into these activities. If you’re doing something that doesn’t fall into these categories, odds are that you aren’t doing SEO.

  1. Increase Crawlability & Indexation
  2. Increase Topicality
  3. Increase Authority
  4. Increase User Metrics that Google Measures
  5. Decrease Risk of Penalties

Let’s take a look at each of these activities in more depth.

Crawlability & Indexation

It’s important to make sure that Google can find more pages of your site. This isn’t usually a concern for smaller sites, but sites with hundreds of thousands or millions of pages need to put some effort into making sure Google can crawl all of them, and include them in the index.

Modern Google is incredibly good at crawling and indexing sites, and it usually requires a pretty boneheaded move somewhere for this to be a major issue.


Topicality

Topicality is making sure that a given page on your site matches what a user is searching for: at the most basic level, it’s making sure that the words someone is searching for are on the page, so that they are included in the database built by the Indexer. In the SEO world this is often referred to as relevance, but I prefer to use topicality because that’s the term Google uses (to Google, relevance is the combination of all factors).

Topicality signals are more than just including keywords on the page, though that is certainly a significant part of it. Topicality centers around words, word count, and unique, quality content.
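To make "the words someone is searching for are on the page" concrete, here is a minimal sketch of an inverted index, the textbook structure for looking up pages by the words they contain. It is a generic illustration, not a description of how Google's Indexer actually works; the page URLs, the crude tokenizer, and the function names are all made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of URLs whose text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def candidates(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical pages, used only to exercise the functions above.
PAGES = {
    "shop.example/strings": "Guitar strings for acoustic and electric guitars",
    "shop.example/gloves": "Gardening gloves in all sizes",
    "blog.example/guitar": "How to tune a guitar",
}

index = build_index(PAGES)
print(candidates(index, "guitar strings"))
print(candidates(index, "snowmen"))
```

A page that never mentions a word cannot even be a candidate for it in a scheme like this, which is the basic reason keywords on the page still matter. Everything beyond that (word count, quality, uniqueness) is layered on top.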


Authority

On the web today there are millions of pages that are topically relevant to just about any search query, and topicality signals are particularly easy to game. Moving beyond topicality to look at authority signals was Google’s huge contribution to changing the way search engines worked.

Authority signals pretty much mean links: a link from another site to a page on your site is an indication that the other site considers your page an authority on that topic. Authority signals are far more important than topicality signals, and more difficult to get. Work on authority can also involve the internal site structure, and how you flow authority through the site.

User Metrics

Google increasingly measures user behavior to determine the best sites to rank. This includes everything from the click through rate in the SERPs and how people are searching for your brand name to how long search users spend on your site.

User metrics were not always as strong a factor as they are now, and much of this used to fall outside the purview of SEO. For modern SEO, these user metrics are vital, and cunning SEOs can realize huge rankings gains by understanding what user metrics Google pays attention to.

Risk & Penalties

Google has a myriad of algorithmic penalties in addition to a webspam team that applies manual penalties. Because computer algorithms can never be perfect, they sometimes catch innocent sites in their algorithmic penalties. As a result, in addition to playing by the rules, SEOs have to be aware of the kinds of things that can trip a penalty and take steps to avoid them. It’s not enough just to be innocent: your site also needs to look innocent. This typically involves things like reducing duplicate content and proper use of canonicalization (more on this in Tactics; Authority – On-Site).

These activities aren’t directly related to improving your traffic; instead they’re defensive moves to prevent losing traffic in the future.

Degrees of Separation

It’s worth noting that it’s often worthwhile for SEOs to pursue one of these activities through actions a step or two removed. For example: building up a large and engaged Twitter following won’t actually increase your Authority; however, when you have something awesome to share on your site, that Twitter following can help spread the word — and hopefully some bloggers or news sites discover and link to it as a result (of course data suggests that using social following to generate links doesn’t actually work).

Thus building that Twitter following was an authority-building activity, but was a couple steps removed (and of course a social media presence has other business benefits outside of SEO).

In general when you’re starting out you want to focus on activities that directly relate to one of the 5 things SEOs do, and possibly some things with one degree of separation.

Unless you really know what you’re doing, only start working on things several degrees away, like growing your Twitter following, once you have your basics in place. Otherwise you’re likely spending larger amounts of effort on things that will see disproportionately smaller returns.


Google’s Goals & Philosophies

The overarching goals and philosophies of Google have shifted significantly over time, but how they approach those goals remains fairly consistent.

In the early days of Google, the search engine became beloved of tech savvy people both for delivering relevant results and for their overall philosophies. Their motto of “Don’t be Evil” resonated with their users and their philosophy for their search results was that they tried to get users to leave their site as quickly as possible. After all, the sooner they can deliver a good result, the better the user experience.

In recent years many former Google fans have noted a fairly drastic departure from their founding philosophies. Google’s shopping results are paid inclusion only, something the Google founders once explicitly said was inherently evil and would always lead to biased results that were bad for users. Google is monetizing their results more aggressively than ever and leverages their monopoly to try to force users into their other products, like Google+.

Google’s philosophy now seems to be to make as much money as possible, and to actually try to keep users on a Google site for as long as possible. They really want users to either click on a paid result, or click on a result that leads to another Google property where they have another chance to deliver ads. They appear to have shifted the bar to optimize a bit more for profit and a bit less for user experience metrics (getting people away from Google as quickly as possible).

Ultimately Google is a for-profit company and they can do what they want with their results (though at least in the EU regulators are starting to clamp down on them), and our job as SEOs is to react to how they change.

But regardless of what Google is optimizing for, the way that they do it remains pretty much the same. Google runs thousands of tests each year. Some of these tests are for changes that are virtually invisible to the average searcher, and some changes are substantial.

Google A/B tests their results constantly. They take a very small percentage of searchers and show them a different version of Google, which is the change that they’re testing. If that test group responds to the new version in a way Google likes better, that version is deployed. Google is running many of these tests every day.
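For readers who have not run A/B tests, here is a rough sketch of how a small, stable slice of users can be routed to a test version. This is a common general technique (hashing an identifier into a bucket), not a description of Google's internal system; the function name, the 1% default, and the experiment label are assumptions for the example.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, test_fraction: float = 0.01) -> str:
    """Deterministically place a user in 'test' or 'control' for one experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a number in the range [0, 1).
    position = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if position < test_fraction else "control"


# The same user always lands in the same bucket for a given experiment,
# and roughly test_fraction of all users see the test version.
users = [f"user-{n}" for n in range(100_000)]
in_test = sum(assign_bucket(u, "hypothetical-serp-layout") == "test" for u in users)
print(f"{in_test / len(users):.2%} of users see the test version")
```

Hashing rather than picking randomly on each visit keeps a searcher's experience consistent, so their behavior can be compared against the control group over time.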

How they measure success for a test is open for debate. We know they used to consider getting someone to leave Google as quickly as possible, and for as long as possible, to be a success. A lot of changes in the last few years suggest that’s no longer the number one metric for success. If a user clicks on a result lower in the SERPs that was generally considered a bad thing, and likely still is. That click on result number 7 or 8 implies that the first few results weren’t the best ones possible. This metric is almost certainly one that Google still tracks. Google also constantly evaluates their ad listings in a similar way: they are continually trying to optimize their ads to show the best and most relevant ads to users — because after all, they’re a business and they only make money when someone clicks an ad.

The point of this section isn’t to suggest that Google is out to get you or to do away with organic results — they’re not. Their success hinges on delivering good-enough search results (they no longer need to deliver the best) and so they will always have to do a pretty good job of showing organic results.

But on the other hand, Google isn’t your friend either. If you do something wrong in your SEO pursuits and your site suddenly fails to rank in organic results at all, I can promise you that no one at Google is going to take your call. There’s no customer service email either — no customer service at all for organic search, which is after all a free service. For SEO you’re on your own in a constantly changing landscape.

Fun story: I once worked for a company that spent tens of millions of dollars a month on Google ads. We were invited to Google HQ regularly and had access to the top guys on the ads team. When we had an issue with something on the organic search side the company applied as much pressure as possible, including emails back and forth between VPs and the CEO and the top Google ads guys. This resulted in absolutely nothing (as it shouldn’t of course — the wall between the Google ad teams and the search teams is very real).

Google is not quite as bad as I’m making it sound, but I do want to stress that if you engage in risky tactics, or even just look similar to a risky tactic to the algorithm, you could quickly find yourself up the creek. And once there, no one at Google is going to give you a paddle.


Query Classification: Do, Know, Go


Google tries to classify search queries into one of three categories: Do, Know, and Go. The type of classification determines the type of web pages that Google tries to deliver as results. Understanding this classification system can help you understand why Google delivers the kinds of pages that it does.

  • Do: The user wants to do something, like buy a product or book a flight. Ecommerce is mostly about Do queries.
  • Know: An informational query, where the user wants to learn about a subject. Very often single-word queries are classified at least partially as Know queries, while the plural version is considered a Do query. Wikipedia almost always ranks toward the top of the results for Know queries, and Google is increasingly trying to show Knowledge Graph entries for Know queries.
  • Go: Also known as a navigational query, the user wants to go to a specific site. Someone searching for “Zappos” or “Trello” almost certainly is just looking for the link to that website.

Not every query is only a Do, Know, or Go query. Often Google thinks there could be different intents behind a single query. For example, a user searching for “Amazon” probably wants to go to the Amazon website, but they might be looking for information about the tropical region in South America. So it could be a Go or a Know query.

Google solves for these kinds of queries by delivering its best guess as the first result or two, then interspersing some results from other possible query intents. This is one of the reasons that Wikipedia is one of the most often ranked websites on the web: Google seems to like covering its bases by returning at least one Know result for many searches, and Wikipedia is a massively authoritative Know-type result.

That said, if you’re not Wikipedia, it’s incredibly difficult to rank well for a term with different intent classification than your site. If you write an in-depth article about Zappos, you will have a very difficult time ranking for a search for “Zappos”, because that has such a strong Go intent. Similarly if you are an ecommerce store you’ll have a harder time ranking for reviews keywords, because those have a strong Know intent.

It can be done, mind you, especially in low competition areas, but it’s far more difficult and you’ll often have poor user metrics — bounce rate and time on site — because most people searching for that phrase are looking for something different. In most cases you’re better off trying to go after search terms that correctly classify to your site.

Another way of saying that is: it’s very difficult to rank for keywords for which your site isn’t a good result. You may think your ecommerce store has awesome user reviews that make it a good result for “tablet reviews”, but in actuality users are better served with something like CNET that provides expert editorial and educational reviews. And Google knows that from user metrics, which we’ll be discussing at length when we get to ranking factors.


Localization & Personalization


In addition to all the SERP elements that we discussed already, Google further modifies the SERPs based on localization and personalization indicators.

For localization, Google looks at the IP address of your computer to get an idea of where you are (or if you’re using your phone, Google has much more precise data about exactly where you are). When you search for certain terms that Google thinks have local intent (like “pizza shops” or “dentists”) it tries to show you results within your area. You can also manually set your location in Google’s Search Settings. Because of this localization a user searching in one city or state will not always see the same results as someone searching from a different city or state.

In addition to localization, Google also tries to deliver personalized search results as much as possible. For personalized results Google mostly relies on your search history. If you click through to certain sites and spend some time there, Google will make a note of it and in the future might deliver those sites higher in the results than they would otherwise deserve to be.

I’ve had many panicked clients think their competitors were passing them by only because they had spent time on the competitor’s site and as a result Google started showing them those sites higher in the rankings. We’ll discuss all the ways you can get non-personalized results in Rank Tracking.

In addition to your search history, Google also personalizes search results based on +1s. If you click the “+1” button on a site, Google makes a note that you like that site and might deliver that site higher in your future searches. Similarly, if people you have in your Circles on Google+ (Google’s social network) have +1’d a site or shared it on Google+, Google might rank that site more highly in your results.

In the world of SEO, we pay attention to non-personalized rankings, but we also have to be aware of the impact of personalization. In short: the better experience that users have on your site the more likely they are to stay a while, which will give you a boost in their future searches.

This all comes back to the concept of whether you deserve to rank: the better your site, the more easily you will be able to rank well, because you’ll be able to accumulate the various ranking factors more easily.


Ranking #1 Impact on Conversion Rate


It turns out there’s more to ranking well than just the extra traffic that it brings. At Wayfair we did some research into how traffic from top rankings converts compared to traffic from lower rankings, and we learned that traffic from the #1 ranking position converted much better than traffic from lower positions.

We wrote the study up at Search Engine Land, but here are the key points that are worth knowing:

[Figure: Impact of ranking #1 on conversion rates]

We saw about a 34% improvement in overall conversion rate from organic traffic for moving a keyword from ranking in positions 3 through 6 up to ranking #1. That’s in addition to the extra traffic we got from ranking better.

Interestingly, while it looked like moving up to rank 3 or rank 2 probably improved conversion, the data was muddy and not statistically significant. It was only on hitting #1 that we saw the massive spike in conversion rate.

Here’s what this might mean to the bottom line in terms of real dollars:

[Figure: Dollar impact of ranking #1 over ranking #6]

The likely cause of this behavior is that as soon as someone isn’t clicking on the #1 organic result, that probably means that either they’re looking for something very specific, or they’ve already clicked on #1 (and 2 and 3) and are price comparing or just shopping around without actually intending to buy.

In other words, if someone is shopping for USB Drives they’ll click the first result: if it’s a good site with the products they want at a reasonable price, they’ll probably just buy. This suggests that a lot of the clicks on results below #1 are actually people who have already clicked on the #1 result and continued to check out what everyone else had to offer.

So the lesson here is that ranking #1 really matters. It’s not just about the extra traffic, it’s also about better traffic.
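To see how the traffic gain and the conversion gain stack, here is a small worked example. Only the 34% conversion lift comes from the study above. The click-through rates for rank 6 and rank 1 are borrowed from the commercial click distribution in the SERP Click Distribution lesson, and the search volume, baseline conversion rate, and average order value are invented numbers chosen to keep the arithmetic easy.

```python
def monthly_revenue(searches: int, ctr: float, conversion_rate: float,
                    avg_order_value: float) -> float:
    """Revenue = searches x click-through rate x conversion rate x order value."""
    return searches * ctr * conversion_rate * avg_order_value


# Hypothetical inputs, not data from the study.
SEARCHES = 10_000            # monthly searches for the keyword
BASE_CONVERSION_RATE = 0.02  # conversion rate when ranking around #6
AVG_ORDER_VALUE = 100.0      # dollars per order

# Click-through rates from the commercial click distribution.
CTR_RANK_6 = 0.045
CTR_RANK_1 = 0.13

# Conversion-rate improvement reported for moving from positions 3-6 to #1.
CONVERSION_LIFT_AT_1 = 0.34

at_rank_6 = monthly_revenue(SEARCHES, CTR_RANK_6, BASE_CONVERSION_RATE, AVG_ORDER_VALUE)
traffic_only = monthly_revenue(SEARCHES, CTR_RANK_1, BASE_CONVERSION_RATE, AVG_ORDER_VALUE)
at_rank_1 = monthly_revenue(SEARCHES, CTR_RANK_1,
                            BASE_CONVERSION_RATE * (1 + CONVERSION_LIFT_AT_1),
                            AVG_ORDER_VALUE)

print(f"Rank 6:                         ${at_rank_6:,.0f}")
print(f"Rank 1, extra traffic only:     ${traffic_only:,.0f}")
print(f"Rank 1, traffic and conversion: ${at_rank_1:,.0f}")
```

With these made-up inputs the keyword goes from roughly $900 a month at rank 6 to about $2,600 from the extra clicks alone, and to about $3,484 once the better conversion rate is included. The two effects multiply rather than add, which is why the jump to #1 is worth more than the traffic numbers alone suggest.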


SERP Click Distribution: How Many People Click on #1

We know that ranking better in the search results is going to get us more traffic, because more people will see and click on our result. But how many more?

The question here is: what is the distribution of clicks in the search results?

The most accurate answer is that it depends. Every SERP behaves differently. What the paid ad landscape looks like has a huge impact on how many clicks there are left over for organic (which is usually still the large majority of clicks). Google Shopping results, for example, tend to attract a lot of clicks, and if shopping results are present you’ll see much different organic click distributions than if they’re not. And paid elements like that tend to come and go and switch position from day to day as Google is constantly testing and optimizing themselves.

But even within SERPs that have the same ad elements the click distributions can vary wildly. For some SERPs the users seem content to just look at what the first result has to say, while other SERPs seem to have users that scour the titles and descriptions of each listing before making their choice.

As long as we understand well that every SERP is different, we can start talking about what the average SERP click distribution looks like.

For the sake of keeping things simple, here is a good estimate of overall average click through rates on organic results:

Rank 1: 30%
Rank 2: 15%
Rank 3: 10%
Rank 4: 6%
Rank 5: 4.5%
Rank 6: 3%
Rank 7: 2.5%
Rank 8: 2%
Rank 9: 1.2%
Rank 10: 1.5%

This estimate is a combination of a lot of experience, various studies, and examining tons of results in Hitwise and Google’s Search Console.

Note, however, this average includes branded searches. So in addition to people searching for “tent poles” and choosing a result to click, we have people searching for “trip advisor” — obviously those people are incredibly likely to click on the Trip Advisor result, which throws the average for the top positions way off. So you’ll rarely see a keyword with this average click distribution.

Instead, here is a pretty common click distribution for commercial transactional keywords (searches like “gardening gloves” or “guitar strings”) that generally include Google shopping results as well as the normal paid ads:

Rank 1: 13%
Rank 2: 10%
Rank 3: 8%
Rank 4: 7%
Rank 5: 5.5%
Rank 6: 4.5%
Rank 7: 3.5%
Rank 8: 3%
Rank 9: 3.2%
Rank 10: 3.3%

Even on these results with Google shopping showing up in addition to ads, organic clicks on the first page are still gathering over 60% of all the clicks.
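The two distributions above are easy to turn into a quick traffic estimator. The sketch below simply encodes both tables and multiplies; the function names are invented for the example, and returning zero for anything past position 10 is an assumption, since the tables only cover the first page.

```python
# Click-through rate by organic rank, copied from the two tables above.
AVERAGE_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.06, 5: 0.045,
               6: 0.03, 7: 0.025, 8: 0.02, 9: 0.012, 10: 0.015}
COMMERCIAL_CTR = {1: 0.13, 2: 0.10, 3: 0.08, 4: 0.07, 5: 0.055,
                  6: 0.045, 7: 0.035, 8: 0.03, 9: 0.032, 10: 0.033}


def estimated_clicks(monthly_searches: int, rank: int,
                     ctr_table: dict[int, float] = COMMERCIAL_CTR) -> float:
    """Estimate monthly organic clicks for a keyword at a given rank."""
    # Ranks beyond the first page are treated as zero clicks (an assumption).
    return monthly_searches * ctr_table.get(rank, 0.0)


def first_page_share(ctr_table: dict[int, float]) -> float:
    """Share of all searches that end in a click on a first-page organic result."""
    return sum(ctr_table.values())


print(f"Commercial first-page organic share: {first_page_share(COMMERCIAL_CTR):.1%}")
print(f"Average first-page organic share:    {first_page_share(AVERAGE_CTR):.1%}")
print(f"Clicks at rank 1 for 5,000 searches: {estimated_clicks(5_000, 1):.0f}")
print(f"Clicks at rank 5 for 5,000 searches: {estimated_clicks(5_000, 5):.0f}")
```

Summing the commercial table gives 61%, which is where the "over 60%" figure comes from. Remember that every SERP is different, so treat any estimate from these tables as a starting point rather than a forecast.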

Note that on mobile we see a slightly different click distribution in the SERPs, with the #1 rank gathering closer to 16% of clicks, and #2 slightly inflated as well. However, the difference isn’t as large as what you might expect given how most of the mobile screen is filled with ads.


The Google SERPs

In the SEO world, a SERP is a Search Engine Results Page. It is the page of links that Google returns when you search for something.

Back in the 90s Google’s SERPs were a list of 10 links, with perhaps some ads showing up at the top or on the right side. Since then Google’s SERPs have grown much more complicated, with different independent segments integrated into the organic results, and more paid placement than ever.

Here is a fairly basic SERP:

[Figure: Example of a Google search results page]

Note that everything outlined in red is a paid result. Even the Google shopping results are paid inclusion only — you can’t be there without paying for the privilege. Google is becoming increasingly aggressive about monetizing the SERPs and promoting their own properties within the results.

The good news is that despite this newfound focus on money, most people still click the organic results rather than the paid results (and lots of tests suggest that most people who click on a paid result don’t even realize that they’re clicking on a paid ad).

On average, from 50% to 80% of the clicks in SERPs are on organic results. It varies a lot depending on the search and what kind of paid elements are included: Google’s shopping results pull more clicks away from organic, and SERPs with four ads on top draw more paid clicks than SERPs with three. SERPs with only one ad on top commonly see over 95% of clicks on organic.

Since paid ads make a big difference in click through rates, let’s take a look at some different ways that Google monetizes SERPs:

[Figure: Example of Google flight search results]

Those flight results below the ads are not Google being helpful and trying to find you the best flight. They are also paid inclusion (notice the gray “sponsored” in the upper right). Everything on this SERP is a paid ad. Google is integrating more and more verticals into the results in this way.

Now let’s take a look at how different organic elements get integrated into the SERPs.

[Figure: Example of Google dentist search results]

Here we’re looking at a search that Google determined has local intent. This can be triggered in a number of ways:

  • If the search includes the name of a city or state, that’s a good indicator to Google that they should return local results.
  • If the search is made from a mobile device, like a smartphone or tablet, Google may take the GPS data from the phone to determine what should be in the search results.
  • Some searches Google has just learned over time are likely to have local intent — looking for restaurants or, in this screenshot, dentists. In these cases Google gives results based on how your location is set in your search settings to deliver local results.

That pack of local listings beneath the ads are all businesses that have Google+ Local Pages, and are tied to the little flags on the map. That local pack is not paid, but it does require the businesses to create a Google+ Local Page account and fill out their info. The local pack is inserted into the organic results, pushing most of them down the page. Ranking within the local pack is an entirely different process than ranking organically (explained more in the Local SEO section).

A similar organic element, though less intrusive, is Google news. A news result might look like this:

[Figure: Example of a Google News search result]

Here Google has determined that the search query is likely to be looking for a news result, which must rely on freshness (how recent it is) much more than overall accumulation of links. So Google has inserted the news block into the rest of the organic results.

Here is another search result page that includes image results:

[Figure: Example of Google results with an image block]

Google has determined that some percentage of people searching for this query are likely to want to see image results, and so they’ve shoved a big block of images right in the middle of the organic results. Once again, these are not paid, but images are ranked somewhat differently than web pages (explained further in Image SEO).

Finally, here is a SERP with a Knowledge Graph:

[Figure: Example of the Knowledge Graph in Google search results]

The Knowledge Graph is Google’s effort to become more relevant by delivering answers directly in the search results, rather than directing users to a site to find the answer. The Knowledge Graph is mostly populated off of a few Creative Commons sources like Wikipedia, but Google is increasingly integrating sources from other sites. This has led some webmasters to complain bitterly that Google is essentially scraping their site and delivering their content directly — and of course then Google gets to show its ads on that content (the SERPs) rather than the webmaster. In fact Google has begun experimenting with inserting paid links directly into the Knowledge Graph to better monetize that space.

Google’s response to complaints that it’s stealing data, as always, is that the webmaster has the option of blocking Google from his site, which of course prevents the site from ever showing up in Google’s search results.

In point of fact, Google started doing this long before the Knowledge Graph with direct answers at the top of the SERPs. Once upon a time there were websites that enjoyed decent ad revenue by giving unit conversions and simple subject facts, but the days of sites like that are numbered, if not already over. Google wants to deliver simple factoids itself now.

Here are some shots of Google answers SERPs:

[Figure: Example of Google answers directly in the search results]

[Figure: Example of Google converting currency in the search results]

[Figure: Example of Google giving answers directly in the search results]

The point of this section is to understand that there are a lot of different pieces to the SERPs, even within the organic results.

Google usually shows 10 organic results, plus the paid options and any additional elements (local pack, images, video, news, etc.). For certain searches, however, Google will now only show 7 results. These are typically searches that Google thinks have strong navigational, or brand, intent. For example, if you search online for Zappos, Google will deliver only 7 results, because it’s pretty sure what you’re looking for.

In the world of SEO, when we talk about a site’s ranking position, we’re talking about the raw organic ranking position, ignoring elements like local, news, images, etc, and ignoring paid listings.

Keep in mind that if the local pack appears before your organic result, that’s going to absolutely destroy the amount of traffic you get. But to make a fair comparison, and because there’s nothing you can do about where Google chooses to insert additional organic blocks, we don’t include them in our position reporting.

We absolutely must, however, be aware of them and how they impact our potential opportunity (a search result with local results, or news, or other elements on top may not be worth the investment to rank #1, when there isn’t a whole lot of traffic from that #1 ranking).
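To make the reporting convention above concrete, here is a minimal sketch of how you might compute a raw organic position from a SERP you have recorded by hand. The `(element_type, url)` layout and the element type names are my own assumptions for illustration, not any format Google provides.

```python
# Hypothetical SERP layout: a list of (element_type, url) tuples in display order.
# The type names below are illustrative assumptions, not a Google format.
NON_ORGANIC = {"paid", "local_pack", "news", "images", "video", "knowledge_graph"}


def organic_position(serp, domain):
    """Return the 1-based raw organic rank of the first result for `domain`, or None."""
    position = 0
    for element_type, url in serp:
        if element_type in NON_ORGANIC:
            continue  # ignored for position reporting
        position += 1
        if domain in url:
            return position
    return None


def elements_above(serp, domain):
    """List the non-organic elements displayed above the domain's first organic result."""
    above = []
    for element_type, url in serp:
        if element_type in NON_ORGANIC:
            above.append(element_type)
        elif domain in url:
            return above
    return []  # domain not found in the organic results
```

Reporting both numbers side by side keeps the position comparable across keywords while still flagging the SERPs where a #1 ranking sits underneath a pile of other elements.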


How Search Engines Work


Here is the short version of the actual mechanics of a search engine. This falls in the category of a good thing to know if you’re working on SEO; in practice, however, this information is not strictly necessary. You can know how to rank a site well in a search engine without understanding how search engines crawl and index the web.


Google has an automated program, called Googlebot, which crawls the web. Googlebot has a list of every URL that it knows of and it regularly checks back with each of those URLs. For each URL on the list, it downloads the entire page from the server and passes that along to the Google Indexer. Then Googlebot looks over the page for links and adds any new links to its list of URLs to visit.
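The loop described above (take a URL from the list, download the page, hand it to the indexer, add any new links to the list) can be sketched in a few lines. This is a toy illustration and not Googlebot’s actual code: `fetch` and `index_page` are hypothetical callables you supply, and `max_pages` is a safety limit I added.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page they were found on.
                    self.links.append(urljoin(self.base_url, value))


def crawl(seed_urls, fetch, index_page, max_pages=100):
    """Toy crawl loop. `fetch(url)` returns HTML or None; `index_page(url, html)` stores it."""
    frontier = deque(seed_urls)   # URLs waiting to be visited
    known = set(seed_urls)        # every URL discovered so far
    crawled = 0
    while frontier and crawled < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue              # page could not be downloaded
        index_page(url, html)     # pass the whole page along to the indexer
        crawled += 1
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link not in known: # only new links get added to the list
                known.add(link)
                frontier.append(link)
    return known
```

Passing `fetch` in as a parameter keeps the sketch free of network code, so you can try it against a dictionary of fake pages before pointing it at anything real.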

For URLs that it has discovered before, Googlebot has a schedule of how often to revisit the page. In general, the more links there are to a page, and the more often the page changes, the more frequently Googlebot will visit the page. I’ve had blogs that received a constant stream of comments where Googlebot visited every five minutes to check for updates. I’ve worked on other sites where Googlebot visits most pages just once a month. The decision is made on a page-by-page basis, so one page on a site could get visited every day while another is visited only once a month.

In general you can expect most pages to be visited at least once every 30 days. If the page is substantially different when Googlebot visits again, it will make a note of that and may visit sooner next time. If the page is mostly the same as on the previous crawl, Googlebot may wait longer before the next crawl (though 30 days seems to be a common cap even for pages that never change, as long as they have decent links pointing to them). In this way Google tries to balance conserving crawling resources against keeping its index as up to date as possible.
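Google does not publish its scheduling rules, so treat the following as a rough sketch of the idea only. It assumes a simple rule of my own (halve the wait when the page changed, double it when it did not), with the five-minute floor and 30-day cap taken from the observations above. It also treats any change at all as a change, whereas a real crawler would need a fuzzier notion of “substantially different”.

```python
import hashlib
from datetime import timedelta

# Bounds taken from the observations above; the exact values are assumptions.
MIN_INTERVAL = timedelta(minutes=5)
MAX_INTERVAL = timedelta(days=30)


def content_fingerprint(html):
    """Hash the page so two crawls can be compared cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def next_interval(current, old_fingerprint, new_fingerprint):
    """Return how long to wait before the next visit (assumed halve/double rule)."""
    if new_fingerprint != old_fingerprint:
        proposed = current / 2   # page changed: come back sooner
    else:
        proposed = current * 2   # page unchanged: wait longer
    # Clamp to the floor and the 30-day cap.
    return max(MIN_INTERVAL, min(MAX_INTERVAL, proposed))
```

The sketch ignores the other input mentioned above, the number of links pointing at the page, which would presumably shift the starting interval.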

As an interesting aside, Google deliberately slows down Googlebot and prevents it from crawling pages as fast as it can, because it doesn’t want to stress website servers. In talking about their crawling capacity, a Google engineer once mentioned in passing that they probably have the ability to take down the entire internet if they really unleashed their crawler. Google is filled with simultaneously fun and terrifying facts like that.


Once Googlebot has downloaded a page, it passes that along to the Indexer. The Indexer looks through the page and makes a note of just about all the words of text it finds (converting them all to lowercase while it goes through).

These words all get stored in the index database: each word could be a keyword in a search, so the database stores every word of the document (except stop words like and, of, or, the, etc.) with a note of what page it was found on, and where in the page it was found.

Thus the database of the Indexer includes a massive list of every word there is (and many that aren’t, given the number of typos on the web). Each word is associated with a list of every web page on the internet that includes that word. This is a really, really big database.
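The structure described above is usually called an inverted index. Here is a minimal sketch of one, assuming a tiny stop word list and a simple regular expression for splitting text into words; both are my own simplifications, and Google’s real tokenization is certainly more sophisticated.

```python
import re
from collections import defaultdict

# A tiny illustrative stop word list (assumption); real lists are longer.
STOP_WORDS = {"and", "of", "or", "the", "a", "an", "in", "to"}


def build_index(pages):
    """Map each word to the pages it appears on and its positions on each page.

    `pages` is a dict of {url: page_text}. The result has the shape
    {word: {url: [position, ...]}}.
    """
    index = defaultdict(lambda: defaultdict(list))
    for url, text in pages.items():
        # Lowercase everything, then split into simple word tokens.
        words = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(words):
            if word in STOP_WORDS:
                continue  # stop words are not stored
            index[word][url].append(position)
    return index
```

Positions are counted before stop words are dropped, so the stored numbers still reflect where each word sat in the original text.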

Invisible Elements

It’s worth noting here that Google’s indexer actually sees the code of a webpage, and not necessarily what users see on the screen. This means that in some ways Google can see a lot more than users; however, there are certain kinds of web features that Google’s Indexer doesn’t see at all or has a lot of trouble seeing: Flash content and JavaScript are the most common culprits.

It is possible to build an entire interactive website in Flash, but the only thing that shows up in the code is a call to a single file. Sites like these have no chance of ranking, because they literally don’t have a single word to be stored in the Indexer database.

Images have a similar problem. Either through a desire to have more control over their graphic design or through simple ignorance, many sites use image files to display text on their site. Rather than using HTML text formatted with CSS, they type into a graphics program and save the result as an image file. The Google Indexer can see that you have an image there, and it can see what the image name is, but it has very little idea what text is in the image. Thus any text in that image doesn’t get included in its database. Google has made some progress on this front, and now has some ability to actually read text in images, but this appears not to carry the same weight as text on the page – it’s not clear if this text makes it into the database, or is used elsewhere in the algorithms (possibly only for image search). For all practical SEO purposes, text in images does not exist.

To get an idea of what Google’s Indexer sees, you can directly view the source code of any webpage. Depending on your browser, you can view the source code of a page by pressing Control-U, by clicking View > View Source, or by right-clicking on a page and selecting View Source.

Alternatively you can use a text-based browser like Lynx (or the online version, Lynx Viewer).
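If you would rather script this check, here is a minimal sketch that approximates a text-only view of a page using Python’s standard library. It is an approximation I am assuming for illustration, not a reproduction of what Google’s Indexer does: it keeps the text found in the HTML, skips anything inside script and style tags, and records each image’s file name and alt attribute, since that is all the code reveals about an image.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collects the words present in the HTML itself, plus basic image details."""

    SKIP = {"script", "style"}  # contents of these tags are not page text

    def __init__(self):
        super().__init__()
        self.words = []
        self.images = []        # (src, alt) pairs; text inside the image is invisible
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            attributes = dict(attrs)
            self.images.append((attributes.get("src"), attributes.get("alt")))

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def text_only_view(html):
    """Return (words, images) for an HTML string."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return parser.words, parser.images
```

If an important headline on your page does not show up in the returned word list, that is a good hint it lives in an image or is written in by a script.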

Query Processor

The final part of the search engine is where the magic happens. A user enters a search query into the search box, then Google’s Query Processor goes to the index to retrieve every page that seems to match the query. Then some magic happens where Google decides what order to rank the results.

It’s not magic of course, and it’s not even unknown what happens at this stage. Google says they use over 200 ranking factors to determine how to rank search results. We’ll discuss what the major factors are later. In fact, this entire site is really mostly an explanation of what the “magic happens” part of Google really is.

The truly impressive thing about the Query Processor is its speed. Using only normal desktop computers (but lots and lots and lots of them) Google is able to take your query, check the index, get a list of millions of pages that include the same words, figure out which are probably best for you, and rank every one of the millions of results in order. And Google does this in a fraction of a second.
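The retrieve-then-rank flow can be sketched against a small word index of the shape `{word: {url: [positions]}}`. The scoring here, a plain count of how often the query words appear on the page, is a deliberately crude stand-in that I am assuming for illustration. It is not one of Google’s ranking factors, and the function and parameter names are hypothetical.

```python
def search(index, query, stop_words=frozenset({"and", "of", "or", "the"})):
    """Return URLs containing every query word, best toy score first.

    `index` has the shape {word: {url: [position, ...]}}.
    """
    terms = [word for word in query.lower().split() if word not in stop_words]
    if not terms:
        return []

    # Retrieval: keep only pages that contain every remaining query word.
    candidate_sets = [set(index.get(term, {})) for term in terms]
    matches = set.intersection(*candidate_sets)

    # Ranking: toy score = total occurrences of the query words on the page.
    def score(url):
        return sum(len(index[term][url]) for term in terms)

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(matches, key=lambda url: (-score(url), url))
```

The real work in a search engine is in replacing that toy `score` function with something that reflects relevance and authority, which is what the rest of this site is about.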

Google is obsessed with speed, by the way. Early experiments at Google showed them that delays of even a fraction of a second resulted in users making fewer searches. Google believes that speed is crucial to a good user experience, and it thinks that is true of your site as well.
