Canonical URLs: Telling Search Engines Which to Use

For any given page on a website, there is often more than one URL that can be used to reach it. This is useful in certain situations, such as when you want to view your page before a new domain has propagated, but it can also cause undesired effects, such as search engines seeing the same page under several different URLs.

Fortunately, most search engines, such as Google and Bing, understand that a page can have multiple working URLs. They can usually work out which pages are duplicates and which URL is most likely the canonical URL, that is, the preferred URL for a page.

Examples of URLs for a Single Page

Depending on your setup, it may be possible to reach the same page through any of the following example URLs:

  • primarydomain.com
  • www.primarydomain.com
  • 10.0.0.2/~username/
  • primarydomain.com/index.php
  • www.primarydomain.com/index.php
  • 10.0.0.2/~username/index.php

If the domain is an addon domain, it may be possible to view the same page through these hypothetical URLs:

  • addondomain.com
  • www.addondomain.com
  • primarydomain.com/addondomain.com
  • addondomain.primarydomain.com
  • 10.0.0.2/~username/addondomain.com
  • addondomain.com/index.php
  • www.addondomain.com/index.php
  • primarydomain.com/addondomain.com/index.php
  • addondomain.primarydomain.com/index.php
  • 10.0.0.2/~username/addondomain.com/index.php
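To illustrate how several of these variants can collapse into one preferred form, here is a minimal Python sketch. It assumes, purely for illustration, that the site owner prefers the www host and wants the index file name left off; the names PREFERRED_HOST, HOST_ALIASES, INDEX_FILES, and preferred_url are hypothetical and not part of any hosting tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical preferences for this example; adjust them for your own site.
PREFERRED_HOST = "www.primarydomain.com"
HOST_ALIASES = {"primarydomain.com", "www.primarydomain.com"}
INDEX_FILES = ("index.php", "index.html")


def preferred_url(url):
    """Map a known variant of a page URL to one preferred form."""
    # urlsplit only recognises the host part when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)

    # Host names are case-insensitive. Any port number is ignored here.
    host = (parts.hostname or "").lower()
    if host in HOST_ALIASES:
        host = PREFERRED_HOST

    # Treat "/index.php" and "/" as the same page.
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break

    # Keep the query string, drop any fragment.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))
```

With these assumptions, primarydomain.com, www.primarydomain.com, and www.primarydomain.com/index.php all map to http://www.primarydomain.com/. The temporary URL on 10.0.0.2 keeps its own host, because the sketch has no way to know that it serves the same files.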

How Search Engines Guess the Canonical (Preferred) URL

First, it is important to note that even though all of these URLs point to the same file, search engines and your visitors will never encounter most of them and will not even know they exist. For example, the temporary URL your host gave you, with your IP address and username in it, will only be known to you unless you tell someone else about it.

Search engines only know that a URL exists because someone told them about it, usually by using it in a link on a web page somewhere.

Search engines usually find URLs in one of a few ways:

  • They found a link to your page on a web page they already knew existed.
  • They found the link / URL in a site map and/or RSS feed.
  • The link / URL was submitted to them directly, usually via their website.
  • Someone visited your page while using the search engine's browser toolbar.
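As an illustration of the site map route, the following Python sketch lists the URLs a site map advertises. It assumes the standard sitemaps.org XML format; the sample site map, the about.php page, and the sitemap_urls function are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical site map for the example domain.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.primarydomain.com/</loc></url>
  <url><loc>http://www.primarydomain.com/about.php</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a site map document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

Every URL in the list uses the same host and form, which is one way a site map can signal the URL you prefer.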

Once they find out about the page, they compare it with other pages that appear to be identical or nearly identical in order to spot duplicates. If they spot a duplicate page, they then try to figure out which URL should be the canonical URL.
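Search engines do not publish how they detect duplicates, so the following Python sketch is only a rough illustration of the idea, not their method. It groups URLs whose content is identical once whitespace and letter case are ignored. The fingerprint and group_duplicates names are hypothetical, and catching near-duplicates would need a more tolerant comparison than an exact hash.

```python
import hashlib
from collections import defaultdict


def fingerprint(html):
    """Hash page content after collapsing whitespace and lowercasing."""
    normalized = " ".join(html.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(pages):
    """Group URLs whose content has the same fingerprint.

    `pages` maps each URL to the HTML fetched from it.
    Only groups containing more than one URL are returned.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Each group returned is a set of URLs that appear to serve the same page, which is the point at which a canonical URL has to be chosen.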

Although they keep their exact algorithms secret, there are some things that they are known to check:

  • Which URL most people use when linking to you.
  • Which URL is used in your site map and RSS feeds.
  • Whether you specified a canonical URL in the head of your pages, using a <link rel="canonical"> element.
  • For Google, whether you specified a canonical URL in Google Webmaster Tools (since renamed Google Search Console).
  • Whether the URL redirects to another URL.
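A canonical URL declared in a page is a link element with rel="canonical" in the page's head. The Python sketch below shows what that element looks like and how a crawler might read it, using only the standard library. The sample page and the CanonicalFinder and find_canonical names are assumptions for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page that declares its preferred URL.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Home</title>
    <link rel="canonical" href="http://www.primarydomain.com/" />
  </head>
  <body>Welcome</body>
</html>
"""


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

If every variant of the page serves this same HTML, each one points back to the single preferred URL, regardless of which URL was used to reach it.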
