Link Tactics to Avoid Because They Will Get You Penalized

The reason we want to avoid certain link building practices is because eventually Google could discover what we’re doing — either algorithmically or due to a manual review — and penalize the site, completely removing our ability to rank. Every year we see more instances of competitors or random bloggers outing companies for their shady link building practices, and someone at Google looks into it and penalizes the site.

It is simply not worth the risk of engaging in spammy or forbidden link acquisition tactics, unless your strategy is to rank a site for a few weeks or months and then throw it away.

The following tactics are all against the rules for one reason or another:

  • Do not pay/compensate for a link: you cannot compensate a blogger, site owner, or article author in any way for a link (you can pay for nofollow links that don’t help you rank, of course). This includes any form of compensation: you cannot offer money for links (JC Penney was famously penalized for this). You cannot give away a free product for a link in a “review.” You cannot offer a discount in exchange for a link (Overstock.com was penalized for this). You cannot offer to tweet their post in exchange for a link (RapGenius.com was penalized for this). You literally cannot compensate a site owner or author in any way for a link, because the whole point of links is that they’re supposed to be legitimate signals that people find your site valuable, without bribery involved.
  • Avoid Directory Links: there are hundreds of thousands of link directories where you can place a link to your site for free, or for a nominal charge. There are services that will get you listed in hundreds or thousands of directories for surprisingly little. These are bad links that could hurt you. That said, there are some legitimate industry-specific directories where getting a link is fine. These are easy to identify: they are directories that people actually use and the kind of place that you’d be willing to pay for a listing even if the link didn’t help your SEO. In general, directories are not a valid part of your link acquisition strategy.
  • Avoid Social Bookmarking: At some point some SEOs discovered that various social sites (often forums) did not nofollow links within profiles. Next thing you know there were services offering hundreds of social bookmark links for one low price. Although easily automated, these links never helped much and can hurt you now. Feel free to put a link to your homepage in your own profile on sites you actually use regularly, but social bookmarking is not a valid link acquisition strategy. This includes things like Squidoo lenses or Tumblr linking.
  • Don’t Do Article Marketing Sites: There are sites where you can submit an article — on any topic at all — and they will host it and link back to you. This was a big automated link acquisition tactic in the early 2000s. It rarely worked back then and it will hurt you now.
  • Don’t Use Press Releases for Links: According to Google all links in press releases should be nofollow. This wasn’t entirely practical since tons of large businesses use press releases for legitimate purposes, so Google followed up by just not counting any links from press release sites and removed the ability of most press release sites to rank as well. There are thousands of sites that host press releases, and most of them exist for the same reasons as the article marketing sites – automated acquisition of hundreds and thousands of links. Press releases are not a valid part of a legitimate link acquisition strategy.
  • Don’t Guest Blog (many exceptions): Guest blogging was a big strategy in the early 2010s that Google finally shut down. The idea was you went to a blogger and offered to write a post on their site about a relevant topic. You got a link back either in the byline, or by embedding that link into the article. This was considered a legitimate tactic at first, but then lazy SEOs began automating the process (notice a trend here?): they’d outsource the writing, then outsource even the placement. Bloggers’ inboxes filled with millions of spam emails in broken English asking to guest post. Google finally shut it down by declaring guest posting to be a spammy tactic subject to penalty. Don’t get me wrong: if someone comes to you and asks you to write a post for their blog, by all means do so. If Huffington Post or Ars Technica or the New York Times is going to post your article, that’s almost definitely fine. There are very legitimate guest posting opportunities: they’re the ones at sites everyone has heard of and that are hard to get into. But you should have at most a small fraction of your links from guest blogging.
  • Widgets – Use Caution: in the early 2000s widgets were a killer link building tactic. You built a neat little widget that someone could install on their own blog — perhaps an interactive calendar or a quiz to see which Star Wars character you were. When the user put the code on their site, it included a credit link for the widget. Eventually that tactic became hugely manipulative with anchor text links pointing to sites entirely unrelated to the widget, or embedding a dozen hidden anchor text links, as Houzz famously did before being penalized (for that and other shady linkbuilding moves). Google started cracking down on widget links, first removing link value, and even penalizing. That said, Google has made it clear that you can produce a widget that has a credit link, provided that link goes to a legitimate page (it should go to the page about the widget where a user can get the widget code) and does not use keyword-rich anchor text. The anchor text should be the name of your site. So this tactic is valid in small doses if done very carefully.
  • Avoid Anything Automated: You’ll notice that the one common thread running through most of these practices that can now get you penalized is automation. (You could well argue that another common thread is that the site did not earn the links). Any time you are offered an automated way to get a lot of links for a relatively cheap price, you can be certain it’s either against the rules and could get you penalized, or it will become against the rules in the near future.

Okay, so that is the definitely bad list. You should not do these things. They are risky and they are against the rules.

Off-Site Authority Intro

Off-site authority building can be summed up simply: getting links from other sites that point to your site. You want as many links as you can get, from the highest-quality sites you can get them from.

This is without question the most difficult part of SEO, where the real SEO work happens. Authority is by far the largest ranking factor (though user metrics are quickly rising to a similar level), and the off-site authority building that you do is what will determine whether your site ranks or not.

Unsurprisingly, it’s also going to eat up more words on this site than any other section.

Links have been at the heart of Google’s ranking algorithm from the day it launched, and every single year it has grown harder to rank for the simple reason that every year there are more sites competing for rankings, and the established sites have had an extra year to gain more links — so new sites need increasingly more (or better) links to have a shot at competing.

Because authority is so incredibly important to ranking, many SEOs have looked for and found shortcuts and automations to get links. Every year Google gets more and more aggressive about shutting down these shady and black-hat tactics. The good news is the legitimate ways sites have always used to earn links continue to work year after year. The bad news is those ways are work — that’s why so many SEOs spend so much time looking for shortcuts.

Before we start diving into the actual tactics of attracting links to your site, it’s worth taking some time to address what not to do — because you will be confronted with many people offering to sell you shortcuts that could get you penalized. We need to understand what Google expects your backlink profile to look like, which helps identify what is a legitimate link-building method and what is against the rules.

Let’s start with what you should unequivocally not be doing.

Reuse URLs to Preserve Authority

As a final note on the topic of on-site authority, whenever possible you want to re-use existing URLs that already have links to them, rather than building new versions.

For example, if you have an article on the best vacation spots of 2013 that got some links and attention — reuse that page when you do the 2014 article! Rather than making a whole new page and starting over from scratch, just replace the contents with the 2014 content. Then your new article is already starting out with a bunch of links pointing to it and a history of good user metrics, making it more likely to rank.

If you really want to preserve the old content, make a new article for the old stuff and backdate it!

Obviously this technique only works if the old content is no longer attracting visits and links. This is commonly used in the ecommerce industry for Black Friday sales: sites like Amazon.com and Target.com don’t make a brand new URL every year — they use the exact same Black Friday URL year after year so that it can continue to accumulate more and more and more links (which makes it increasingly difficult for anyone just entering the game to compete for rankings).

You should think ahead when using this technique and make sure that your URL does not include date info. Amazon surely doesn’t want its Black Friday page to live at amazon.com/black-friday-2008, for example.

Nofollow Links & When to Use Them

We’ve mentioned nofollow links before in the authority overview, and it’s about time to explain them in more detail. Nofollow is just an attribute you can give to any link you create to prevent that link from passing any authority or ranking signals. The HTML for a normal link looks like this:

<a href="http://www.domain.com/awesome.html">Click Here</a>

That is the standard, simple, default link format that will pass authority and help the target page rank better. The text “Click Here” will be the highlighted link. A nofollow link looks like this:

<a href="http://www.domain.com/awesome.html" rel="nofollow">Click Here</a>

The nofollow link will not pass any authority signals and will not directly help the target page rank better. Despite the name, however, Google will follow nofollow links — Googlebot often crawls these links, but makes a note that they don’t count for ranking purposes.

Many webmasters and SEOs use this attribute incorrectly. As we learned in PageRank, the PageRank that flows out from a page is divided by the number of links. Some SEOs still use nofollow to try to concentrate more PageRank into the other links: this does not work. Some SEOs think that Google won’t follow a nofollow link: this is not true. Google does follow them.

There are basically only three kinds of links you will want to make nofollow:

  1. Any kind of paid link, or link you were compensated for in any way, should be nofollowed. It’s against Google’s rules to pay for a link that passes PageRank, and a site that buys or sells links can get severely penalized by Google. Thus all ads should be nofollowed.
  2. User Generated Content (UGC) links should be nofollowed. This means anyplace where a user can write something on your page — like forums, comments, reviews — any link they include should automatically be nofollowed. You have no control over where they might link and you want to distance your site from those links, particularly if they link to spammy sites. Further, keeping them nofollow will somewhat reduce the amount of spam you get (from all the SEOs trying to use your forums/comments to build links to their own sites).
  3. Any links to sites you really, really don’t want to help. If for some reason you are linking out to a site that you are competing with for rankings (perhaps you sell information on Black Mold and want to link to the Wikipedia page but don’t want to help it outrank you) you will want to nofollow that link.

For the most part, other than ads and UGC, none of your links should be nofollow. Links to your privacy policy or to your social media accounts do not need to be nofollow, and it doesn’t help you if they are.

The Key to Understanding Nofollow Links

The core principle to remember with nofollow links is that making a link nofollow does not help your site in any way. All it does is prevent that link from helping another site. But your site will flow the same amount of PageRank and be crawled just the same regardless of what you do with nofollow.

Pagination

Most sites have some form of pagination in them. Blogs show the most recent articles, then you can click through pagination for lists of older articles. The product listing pages of ecommerce stores have pagination to go through the entire list of products in any category.

Google used to have problems with pagination and sometimes it would rank, for example, page 3 instead of the first page. This is bad because you usually have the first page of the series optimized to be the best user experience.

Google created rel=prev and rel=next tags to better understand pagination. With these tags Google understands when it’s seeing a paginated series and knows what the first page of the series is.

For very large sites, pagination can be somewhat important: for small sites it’s usually not important at all unless you know Google is ranking the wrong page of a paginated series.

Here’s how you can implement pagination:

In the <head> section of the page code, include the following for each paginated page:

<link rel="next" href="http://www.yourdomain.com/the-next-url.html" />
<link rel="prev" href="http://www.yourdomain.com/the-previous-url.html" />

The “next” URL should be the next page in the series: so if you’re currently on page 2, the “next” will be the URL for page 3. Similarly the “prev” URL should be the previous page in the series: so if you’re currently on page 2, the “prev” will be the URL for page 1.

If you’re on the first page, you don’t need to include the “prev” line, and if you’re on the last page you don’t need to include the “next” line.
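To make that concrete, here is a sketch of what the markup might look like on page 2 of a hypothetical three-page category (the URLs are placeholders, not a real site):

```html
<!-- Page 2 of a three-page series declares both of its neighbors -->
<head>
  <link rel="prev" href="http://www.yourdomain.com/widgets/page1.html" />
  <link rel="next" href="http://www.yourdomain.com/widgets/page3.html" />
</head>
```

Page 1 of that same series would carry only the rel="next" line, and page 3 only the rel="prev" line.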

There is alternate markup that you can use if you have a View All page in addition to your Pagination. I generally don’t recommend using this, because if you have a View All page, that is the one that Google will want to return in search results, and the View All page is usually not the best user experience (if it was, you’d just use that and not have any paginated results, after all).

If you really want to have both pagination and a View All and serve the View All in the search results, instructions for that can be found here.

Duplicate Content & Rel=Canonical

Another way to control the authority flow within your site that is specific to duplicate content is rel=canonical. This code is used when your site has multiple pages that are nearly identical. This happens in almost every site of any size and isn’t necessarily a bad thing. Some site features that create duplicate content include:

  • On ecommerce sites anything that lets you change the order in which products are displayed usually creates duplicate content. If you let the user sort by best-selling, or by price (and you should), that will usually create a variant URL, but the content of the page is identical, just in a different order.
  • On blogs your archive pages usually create duplicate content. For example when you look under categories, or tags, you’re getting a list of the same blog posts that exist elsewhere.
  • Ecommerce sites that use the category structure in the URL create duplicate product pages when there are different paths to navigate to a product. You might access a product both at site.com/blue-shoes/awesome-sneakers and site.com/mens-shoes/awesome-sneakers, for example.

Having internal duplicate content isn’t necessarily a bad thing. Google has even said that 25% of all the pages on the internet are duplicate pages; it’s a part of site design that makes navigation better for users and Google has learned how to deal with it.

The problem for SEOs is controlling which page ranks, and the greater issue of dilution of authority.

When Google sees two or more pages on your site that are mostly duplicates of each other (they can be slightly different and still be duplicate content) Google will choose just one of those pages to rank, and that might not be the page you want it to be.

Furthermore, perhaps some people link to one URL of your content and other people link to the other URL. Maybe you have 20 links to this great page you’ve created, but there are only 10 links to each version. Now that page only has half the links it should and isn’t ranking nearly as highly.

Enter Rel=Canonical

This is where rel=canonical comes in. This tag sits in the <head> portion of your site code and tells Google which version of a page is the canonical version — which is the official version that Google should rank. Every version should have the rel=canonical tag and they should all point to the same official URL. Then any links to any version of the page counts as if it was going to the canonical version (technically you still lose 15% of the link juice, so 85% of the authority passes).

Here’s what rel=canonical should look like, somewhere between the <head> and </head> tags:

<link rel="canonical" href="http://www.domain.com/canonical-url.html" />

As a best practice, every page of your site that you let Google index should have the rel=canonical tag pointing to the official version. The reason is there are a lot of ways for people to link to you that change the URL: you could get links with a refid or UTM code (parameters that are used for tracking purposes), and most large sites have multiple ways to render a URL; even the engineers who work on the site code aren’t familiar with all of them.
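As an illustration (the domain and parameter here are hypothetical), a link someone shares as http://www.domain.com/guide.html?utm_source=newsletter would land on a page whose head contains:

```html
<!-- Both /guide.html and /guide.html?utm_source=newsletter serve this same tag,
     so links to either URL consolidate onto the clean canonical version -->
<link rel="canonical" href="http://www.domain.com/guide.html" />
```

Because every variant URL points at the same official URL, the links to the tracking-parameter version still count toward the page you actually want to rank.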

What you have to be careful to avoid, however, is having duplicate pages that each point their canonical tags at different URLs. This won’t hurt you or get you penalized; Google will simply decide to ignore the canonical tags and once again make its own decision about what to rank.

You can also point the canonical tag to an entirely different site (or subdomain). If you have two different websites, but have certain content that is identical on both sites (duplicate articles, guides, products, etc.) you can choose which site should be the canonical version.

In most cases, canonical implementation is something that you’ll need to talk to your webmaster about. They can either code them to dynamically generate based on internal logic, or they can code them so that you can set the canonical version on each page of your CMS yourself. If you have a WordPress site, you don’t need to worry about canonical tags, because WordPress takes care of that for you.

Rel=canonical is a very useful tool for making sure the correct page is ranking, and ensuring that it ranks as well as it can. It’s an important SEO best practice that should be put in place when your site is created — and as soon as possible if your site was built without it.

Testing SEO

Any good SEO will inevitably spend a decent amount of time conducting SEO tests. Some of these tests are pretty much universal: does Google count internal anchor text (no), do you need text on your page (yes).

But others are very site specific: do title tags actually matter for your site in your implementation (sometimes no, usually yes). And of course sometimes we’re not trying to measure how Google works, but rather just trying to measure how much lift we see from a particular implementation — which lets us accurately estimate the ROI for putting in the work; or alternatively accurately measuring the impact that our work had.

When it comes to measuring how Google treats links, we often do it by linking to random sites with ridiculous anchor text, like stynkeetootsaplop or something similarly strange. The idea is that since nothing ranks for the word, if the target site starts ranking for it, we know Google counted the link and passed anchor text.
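Such a test link might look like the following (the target domain is a placeholder; the anchor text is the made-up phrase from above):

```html
<!-- If targetsite.com later ranks for "stynkeetootsaplop", we know Google
     counted this link and passed its anchor text to the target page -->
<a href="http://www.targetsite.com/">stynkeetootsaplop</a>
```

A quick search for the gibberish phrase before placing the link confirms that nothing ranks for it yet, so any later ranking must come from the test link.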

The easiest way to generate gibberish phrases for testing is to combine random strings of numbers and letters, preferably pretty long ones: 15 to 20 characters. But I much prefer trying to come up with strange phrases that almost sound like a real thing.

Because in the world of SEO, you’re gonna do a lot of testing, so you might as well have some fun with it!

SEO Hero

Most SEOs by now are aware of the SEO Hero challenge from Wix. In an effort to promote their web platform and prove that it’s good for search engine optimization, they are offering a fifty thousand dollar prize if anyone can rank #1 in Google for the phrase SEO Hero — but they will be competing too. In Doctor McAwesome fashion, I’ve built a site to compete in a slightly different way.

Basically instead of using search engine optimization for my own gain, I’m committing to donating the entire prize money to charity.

The idea here is to see if I can get the SEO community to come together around a good cause — set aside individual pursuits of profit to do some good. The basic strategy of the site is to assume goodwill on the part of SEO consultants and agencies: that they will be willing to link to the project to try to send that prize money to a good cause.

The Real Challenge

Honestly, I don’t think the real challenge is outranking all the other SEO Hero sites that are competing (part of the rules of the contest is you have to start with a brand new site, without any registration history, and thus no link history). The real challenge is going to be outranking the sites that aren’t competing, that have ranked for SEO Hero for years.

Some of these are agency sites, and currently one of the top results is SEO Round Table’s article reporting the Wix contest. Sites that have had years to build up not only their domain authority, but also their user metrics and their topical relevance to SEO are going to be tough to outrank.

In addition to a swarm of links, success is going to require driving meaningful traffic to the site (which hopefully a swarm of links will help with) as well as good searcher behavior. I have some ideas to help with user metrics, but for now my main focus is just going to be to build up a stable of links and establish some authority.

If the site gains any traction at all for SEO Hero searches, and makes it in striking distance, it will be very interesting to see how a brand new site with a few months of authority and user signals can compete in Google against sites with years of history.

If nothing else, the competition will provide a very interesting testing ground!

Dead Pages and 301 Redirects

Sometimes you have to take pages off your site. Perhaps a product or service is being discontinued, perhaps it’s outdated information that no longer belongs, or perhaps the page was just a bad idea in the first place.

Discontinuing pages is perfectly normal and happens with every site.

Where we run into trouble is when there were links from other sites pointing at those pages. Now when Googlebot crawls those links it encounters a 404 error rather than the page it was expecting to find. When that happens you do not get credit for that link — after all, the page the site was recommending isn’t there anymore, so the recommendation goes away.

Over time, discontinuing pages can have a cumulatively negative impact on SEO. You are in effect throwing out links, which are by far the most valuable and hardest-to-gain ranking factor. You’d be trying to build your SEO on a pile of bones.

Happily, there’s a solution: the 301 redirect.

301 Redirects

A 301 redirect is a permanent redirect that happens at the server level (not in the HTML code of your site). When someone tries to go to the dead page, instead of getting a 404 Not Found error, they are automatically switched to a different URL, which loads instead.

PageRank flows through a 301 redirect, moving from the dead page to the new one.

Any way of redirecting a page that is not a 301 redirect will not pass full SEO benefit. This includes 302 (temporary) redirects and meta refresh code in the head of a page. (Some Googlers have made statements suggesting that 302s are as good as 301s: tests have proven that this is not true).

For the most part, you will need to talk to your webmaster about setting up 301 redirects. If you are using WordPress there are plugins that will let you implement them pretty easily, and odds are your hosting service probably has some kind of tool to make redirects easy to manage.

If you are using an Apache server (super common for small sites) you can manually create 301 redirects by editing the .htaccess file in your root directory (this is often hidden, so you will have to set your FTP program to display hidden files — note the period in front of the filename).

Here is the code you need to program a 301 redirect directly into .htaccess:

First enable the Apache RewriteEngine (part of the mod_rewrite module) and enable symbolic link matching by pasting this code into your .htaccess:

Options +FollowSymLinks
RewriteEngine on

This only has to be done once, and you can then enter all the redirects you want thereafter. To actually redirect your pages use this format:

Redirect 301 /oldpage.html http://www.domain.com/newpage.html

There are a lot of sophisticated ways to set up 301 redirects based on regular expression (RegEx) matching, but that’s a bit beyond the scope of learning SEO. This will enable you to redirect pages as needed; if you are moving an entire site or need a specific type of pattern matching, you should talk to your software engineer.

When to Use 301 Redirects & Where to Redirect

As a general SEO best practice, any time you take down a page, you should 301 redirect that page’s URL to another page on the site. There are certainly times when it’s appropriate to deliver a 404 error; however, if there are any links to the page you’re taking down, you really want to 301 redirect it instead and preserve that authority for your site.

As for where to redirect a page: if you’re taking a page down and there is another page on the site that provides the same (or nearly the same) info, you should redirect there. This is the easy case where a product is discontinued when the newer version of the product becomes available.

When there isn’t a nice one-for-one replacement, you generally want to redirect to the page that is one level higher in the site hierarchy. For ecommerce that means redirecting product pages to the associated sub-category or category page. Sometimes it will even mean redirecting up to the home page — but if there are links pointing to that dead page you do not want to lose them.
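Putting both cases together, a sketch of what those redirects might look like in .htaccess (the paths and domain here are hypothetical examples):

```apache
# Discontinued product with a direct replacement: redirect to the new version
Redirect 301 /products/widget-v1.html http://www.domain.com/products/widget-v2.html

# Discontinued product with no replacement: redirect one level up, to its category
Redirect 301 /products/old-gizmo.html http://www.domain.com/gizmos/
```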

How to Optimize PageRank / Link Juice

Just as PageRank flows through external links from one site to another, so too does it flow through the internal links within your site. I have seen truly astonishing ranking gains simply by optimizing this internal PageRank flow — though of course you have to have good external links first!

The key factor to optimizing your internal PageRank flow is to reduce the number of links on your page as much as possible, while at the same time making sure you link to your most important pages from your global navigation. By having those links in your global navigation, you’re ensuring that you’re flowing PageRank from every single page of your site to the pages that you want to rank.

This is one of the reasons that you want to have a hierarchical site structure: your most important pages (the ones you want to rank for the most important keywords) are in the top tier in your global navigation; the next most important go a tier down and are linked to from those top tier pages, etc.

There is no golden number of links that you should have on any given page. You want only as many as you need, and no more. Most sites struggle to get below 100 links; somewhere near 200 links is about right for a lot of sites; most sites with 500 or more links per page definitely have far too many.

Sources of Link Bloat

One of the most common culprits for link bloat is the footer. This is the least visited section of your site, and it tends to accumulate all kinds of junk links that don’t really need to be there. Sure, your site might need to have an About Us and a section about your awards, and your founders, and job openings — but those do not all need to be separate links on every single page of your site. Consider instead having the same list of text but making it all a single link that leads to an About Us page that then links out to each individual section.

Global navigation, usually the navigation up at the top of the site with dropdowns, is another common culprit. Your global navigation should link to your most important keyword pages (which should usually be the pages most people are looking for anyway). In a broader sense, however, your global navigation should be used for navigation. I know that sounds obvious, but this is actually a common problem area.

Navigation vs Discovery

A typical mistake that even experienced site merchandisers make is trying to boost flagging pages or products by linking to them in the global nav. The hope is that more people will see the link, and then go visit the page. What they are doing is using the global navigation for discovery instead of navigation. They aren’t trying to help someone get where they want to go: instead they’re trying to interrupt someone’s trip and convince them to go somewhere else.

I once worked on one of the largest ecommerce sites on the internet that had this issue. Over time various category managers tried to boost their categories by inserting more and more links into the global navigation drop-down menus. Many of these links appeared in several different places in the global navigation.

When the taxonomist at the company finally did a comprehensive review, he found that many of these links were generating only 50 clicks per month — this on a site with millions of visits a day. And one of those 50-clicks-per-month links was in the global navigation five different times, while links attracting hundreds of thousands or millions of clicks were there only once.

Navigation is not the place for discovery. If left unchecked, it makes your site increasingly difficult to navigate, harming your user experience… and of course also hurting your SEO, because now your PageRank is divided by all those useless links.

At Wayfair we once removed over 150 links from the drop-down navigation as part of an authority flow optimization. We not only saw big improvements in SEO traffic, but an A/B test showed that the new navigation improved conversion rate and reduced usage of the internal search. Showing fewer navigation options actually made it easier to navigate the site!

There’s nothing wrong with incorporating discovery into your site, but it’s usually better done through ad blocks (where you put your own attractive ads) and it should almost never be done as a global navigation item. This can also help you to really think about whether the ad needs to be on every page of the entire site, or if it’s more appropriate to only certain sections.

SEO vs Usability

Some of the hardest decisions in site navigation involve the balance between SEO and usability. Ultimately you are going to have to link to pages with zero SEO benefit: you need a link to your Contact page; you need to link to your shopping cart or RFQ form; etc. But try to be disciplined about restricting links to pages without SEO value to only those you really do need.

In the end, usability should usually win over SEO. After all, one of the key tenets is Do You Deserve to Rank? You need a great site, and that means a site designed for people, not only for Google. But good usability is like a good break-up: you’ve got to be brutally honest about which links you really need in there.

Which Links to Obsess About

As a general rule of thumb, you should not worry about links in the actual content of your pages. Put links there when they’re useful to users and don’t worry too much about authority flow. Unless you have a truly insane number of links regularly appearing in your content — for example if you have inane rules like “link to at least five pages from the content” — focus your attention elsewhere.

You should be obsessive about two areas of links:

  • Global Navigation Links: anything in your header, footer, and left/right nav. WordPress sites will find common culprits in tag clouds, archives, theme links, credit links (also a good way for hackers to identify your site) etc.
  • Template Level Links: any page types that have their own sub-navigation (like facets on product listing pages, or sub-menus).

Not only should you obsess about every single link appearing in these areas (keeping in mind: user first), but you should revisit them at least every six months. I’ve found that optimizing authority flow is like a game of whack-a-mole: you knock ’em down, but pretty soon a whole bunch more links are popping up.

A Link Calculator Tool

How many links do you have on your pages? Probably a lot more than you really think. You can use the handy Link Calculator tool to quickly get a count, and list, of every link on any page. This tool was built by my cunning engineering team at Wayfair and given to the public to use.