Whatterz


Canonical URLs – What Are They All About?

by Simon. Average Reading Time: about a minute.

As long ago as February, Google announced a new canonical URL tag on their official Webmaster Central Blog:

Carpe diem on any duplicate content worries: we now support a format that allows you to publicly specify your preferred version of a URL. If your site has identical or vastly similar content that’s accessible through multiple URLs, this format provides you with more control over the URL returned in search results. It also helps to make sure that properties such as link popularity are consolidated to your preferred version.

But what do they mean by canonical? One definition of canonical is “reduced to the simplest and most significant form possible without loss of generality”.

What this means is that if you have a page–let’s take an e-commerce product page–and the simplest URL that you want it accessible by is:

http://www.site.com/category/product.html

you can add the canonical tag to that product’s page. Google, Yahoo and Microsoft all support the tag, and their search engines read it to learn which URL you prefer for the current page.

Now, let’s say that the particular software you use also allows you to access the same product using:

http://www.site.com/company/product.html

and

http://www.site.com/different_category/product.html

Perhaps this one product is in multiple categories. With the tag in place, loading any of the alternate pages tells a search engine that it is really the same product as the page named in the canonical tag. You can therefore keep the content available wherever it is needed (by category, tag, or some other organisation system) and still avoid having it treated as duplicate content and penalised.
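As a rough illustration of the idea, here is a minimal Python sketch that maps any of the alternate product URLs above onto the preferred one. The `CANONICAL_CATEGORY` table and the `canonical_url` function are hypothetical names invented for this example; a real shop would look the preferred category up in its own product data.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical lookup: product file name -> the category chosen as canonical.
CANONICAL_CATEGORY = {"product.html": "category"}


def canonical_url(url: str) -> str:
    """Return the preferred URL for a product reachable under several categories."""
    parts = urlsplit(url)
    # The last path segment identifies the product; what precedes it is the category.
    product = parts.path.rsplit("/", 1)[-1]
    category = CANONICAL_CATEGORY.get(product)
    if category is None:
        # Unknown product: return the URL unchanged rather than guess.
        return url
    # Rebuild the URL with the canonical category, dropping query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, f"/{category}/{product}", "", ""))


for alternate in (
    "http://www.site.com/company/product.html",
    "http://www.site.com/different_category/product.html",
):
    print(alternate, "->", canonical_url(alternate))
```

Both alternates resolve to http://www.site.com/category/product.html, which is the value you would then put in the canonical tag on each of those pages.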

To implement the canonical URL tag in your web application, add the following inside the <head> section of every page served from a duplicate URL:

<link rel="canonical" href="http://www.site.com/category/product.html" />
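If your pages are generated by code, the tag can be built and checked programmatically. The following is only a sketch using Python's standard library; `canonical_link_tag`, `CanonicalFinder` and `find_canonical` are illustrative names made up for this example and are not part of any search engine's tooling.

```python
from html import escape
from html.parser import HTMLParser


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place inside <head>, escaping the URL."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_canonical(html_text: str) -> list:
    """Return all canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.hrefs


tag = canonical_link_tag("http://www.site.com/category/product.html")
page = "<html><head><title>Product</title>" + tag + "</head><body></body></html>"
print(tag)
print(find_canonical(page))
```

A check like `find_canonical` can be handy in a test suite to confirm that each alternate page declares exactly one canonical URL, and that it is the one you intended.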

As Google mentions, this tag is a hint, albeit one that they honour strongly. Google will take your preference into account, in conjunction with other signals, when calculating the most relevant page to display in search results.

Other articles I recommend

Optimise Your URLs for Web Crawlers and Indexing

Many questions about website architecture, crawling and indexing, and even ranking issues can be boiled down to one central issue: How easy is it for search engines to crawl your site?

Google, Yahoo and Microsoft Webmaster Tools

The first step to increasing your site’s visibility on the top search engines such as Google, Yahoo! and MSN is to help their respective robots crawl and index your site. To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file. Conversely, and just as importantly, webmasters can notify the search engines about the existence and importance of pages with a sitemap.xml file.

When to use Sub-domains versus Sub-directories and Microsites

The decision to utilise a sub-domain, sub-directory or even a microsite is simply an architectural decision, but one that is often compounded with a marketing decision. In general, sub-directories are used to describe what individual pages are about while sub-domains and microsites are used to describe what an entire site is about.