What is a canonical URL?
A canonical URL is the URL that a website owner designates, using a canonical tag, as the master copy of a page. Owners usually do this when several pages on a site have duplicate or near-duplicate content, or when they want to make clear to Google which pages to crawl, index and return to users in the SERPs (search engine results pages).
Why are canonical URLs important?
If your website has duplicate content, or several pages that cover the same topic, it is highly recommended to use canonicalisation to reduce the risk of your site being flagged for duplicate content, which can significantly harm your SEO performance.
Google will only index canonical URLs, so use canonical tags where necessary to guide Google’s crawlers to the pages that you want to be seen and shown to your users. In the absence of a specified canonical URL, Google will use its best judgement as to which page is the correct one to index and promote in the SERPs. However, it may choose the wrong one.
Google’s very own John Mueller has promoted the use of canonical tags on your website to make it easier for your site to be found, crawled and indexed, so it’s not a bad idea to follow this advice.
How do you set a canonical URL?
When Google crawls your website, it looks at several signals to determine the canonical URL, or master page, for a set of pages with duplicate or near-duplicate content. One of these signals is the canonical tag in the <head> of the page.
The HTML code added to the <head> of your webpage to determine the canonical version looks like this:
<link rel="canonical" href="https://digitalnomadshq.com.au/canonical-page/" />
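To check which canonical URL a page actually declares, you can read the tag out of the HTML. The following is a minimal sketch using only Python's standard library; the names CanonicalParser and find_canonical are made up for this illustration, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            # rel can hold several space-separated values, so split it first
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html):
    """Return the first canonical href declared in the head, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals[0] if parser.canonicals else None


page = """<html><head>
<title>Example</title>
<link rel="canonical" href="https://digitalnomadshq.com.au/canonical-page/" />
</head><body></body></html>"""

print(find_canonical(page))  # https://digitalnomadshq.com.au/canonical-page/
```

The sketch only looks inside the <head>, which is where the tag is expected to be; a canonical link placed in the <body> is deliberately not picked up.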
Canonical URL Best Practices
Canonicalisation is a complex and technical topic, but most website owners only need to know a handful of best practices. To keep things simple, we’ll cover just a few of them here.
Self-referencing canonical tags
A self-referencing canonical tag is a canonical tag that points to the page’s own URL, declaring that page to be the primary one in a set of similar pages.
For example, this page has a self-referencing canonical tag that looks like this:
<link rel="canonical" href="https://digitalnomadshq.com.au/blog/canonical-url/" />
A self-referencing canonical tag tells Google that you consider a particular URL to be canonical and that you’d like Google to index that specific page over others. Indexing is never guaranteed, but the canonical tag is one of the strongest signals Google uses to understand what is and isn’t canonical on your website.
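Whether a tag is self-referencing comes down to comparing the page's own URL with the href in its canonical tag. Here is a rough sketch of that comparison; normalise and is_self_referencing are hypothetical helper names, and the normalisation rules (lowercase scheme and host, ignore the fragment) are a simplifying assumption rather than a description of how Google compares URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Lowercase the scheme and host and drop any #fragment."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),  # host names are case-insensitive
        parts.path or "/",     # treat an empty path as the root
        parts.query,           # query strings are kept: they identify different URLs
        "",                    # fragments never reach the server
    ))


def is_self_referencing(page_url, canonical_href):
    """True when the canonical tag points back at the page it sits on."""
    return normalise(page_url) == normalise(canonical_href)


print(is_self_referencing(
    "https://digitalnomadshq.com.au/blog/canonical-url/",
    "https://DigitalNomadsHQ.com.au/blog/canonical-url/",
))  # True

print(is_self_referencing(
    "https://digitalnomadshq.com.au/blog/canonical-url/?utm_source=newsletter",
    "https://digitalnomadshq.com.au/blog/canonical-url/",
))  # False
```

The second call shows the common case of a tracking-parameter URL whose canonical tag points at the clean version of the page, so it is not self-referencing.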
Exclude non-canonical URLs from your sitemap
Google says you shouldn’t list non-canonical URLs in your sitemap, because it treats every URL listed in a sitemap as a suggested canonical.
As with canonical tags, this doesn’t necessarily mean that Google will always treat a URL in your sitemap as canonical, but it acts as another strong signal to help Google better understand how you would like your site to be viewed and indexed.
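If you generate your sitemap with a script, leaving out non-canonical URLs can be a simple filter. The sketch below assumes you already have a mapping from each page URL to the canonical URL it declares; the build_sitemap function and the example.com URLs are placeholders for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_of):
    """Build sitemap XML that lists only pages which are their own canonical.

    canonical_of maps each page URL to the canonical URL that page declares.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url, canonical in sorted(canonical_of.items()):
        if page_url != canonical:
            continue  # non-canonical page: leave it out of the sitemap
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


pages = {
    "https://example.com/shoes/": "https://example.com/shoes/",
    "https://example.com/shoes/?sort=price": "https://example.com/shoes/",
}

print(build_sitemap(pages))
```

Only the first URL ends up in the output, because the sorted variant declares the unsorted page as its canonical.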
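One quick way to audit your website’s canonical URLs is to use Google Search Console, which is free. Its URL Inspection tool shows both the canonical URL you declared for a page and the canonical URL Google actually selected.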
Don’t canonicalise URLs that return a 404
When a page has been deleted or taken offline, any attempt to access it returns a 404 status code.
A page that returns a 404 is inaccessible to Google’s crawlers as well as to users. Any canonical tag on that page will never be read, so you lose the ability to pass on the page’s authority and consolidate its links, which will hinder your SEO performance. Likewise, a canonical tag that points to a 404 URL gives Google no page to index, so that signal is likely to be ignored.
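A small script can flag canonical URLs that no longer resolve. This is a sketch under a few assumptions: http_status and broken_canonicals are illustrative names, the example.com URLs are placeholders, and the check uses a HEAD request, which some servers do not support (a GET request may be needed instead). The status lookup is passed in as a parameter so the logic can be tried without touching the network.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, following redirects.

    Network failures (DNS errors, timeouts) are not handled here and
    will raise urllib.error.URLError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 404, 410, 500 and so on


def broken_canonicals(canonical_of, status_of=http_status):
    """Return (page, canonical, status) for every canonical that is not a 200."""
    problems = []
    for page_url, canonical in canonical_of.items():
        status = status_of(canonical)
        if status != 200:
            problems.append((page_url, canonical, status))
    return problems


# Offline demonstration with a stand-in status function (no requests are made).
def fake_status(url):
    return 404 if "old-page" in url else 200


pages = {
    "https://example.com/a/": "https://example.com/a/",
    "https://example.com/b/": "https://example.com/old-page/",
}

print(broken_canonicals(pages, status_of=fake_status))
# [('https://example.com/b/', 'https://example.com/old-page/', 404)]
```

To run it against a live site, call broken_canonicals(pages) without the status_of argument so that the default http_status function is used.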