The canonical URL is declared with a tag that lets you tell a search engine which "main" URL to index when the same content is available under different URLs, rather than letting the engine choose for you.

When it comes to optimizing the content of a website, duplicate content is one of the most common difficulties encountered by webmasters, e-tailers and other SEO specialists. A 301 redirect is often recommended as a solution, but it is not always possible to implement and is sometimes not appropriate. This is the case, for example, if the redirect slows down page loading, or if users actually need to visit the page that contains the duplicate content. This is where the canonical URL comes into play, via the rel="canonical" attribute, which solves most duplicate content problems.

Canonical URL: definition

Duplicate content occurs when two web pages with different URLs present identical content. As a reminder, the URL (Uniform Resource Locator) is the unique address of a web page, in a format such as "www.example.com" or "http://example.com/content". In managing your website, having two different pages with the same content is harmful to your SEO.

Duplication can be caused by a poor structure or a poor setup of your web pages. Generally, search engines analyze websites on the principle that each URL carries one unique piece of content. When duplication occurs, the immediate consequence is that this analysis is skewed. For example, in a Google search, the two pages with duplicate content end up competing with each other, which hurts the ranking of both.

The canonical URL tells search engines what the "official address" is for each page. The engine therefore understands that it must index the page under this precise address. Any other version will simply be considered the same page. This element can therefore be seen as a signal that tells search engines which page is the most representative in a batch of duplicated pages.

How to optimize canonical URLs?

Optimizing canonical URLs starts with identifying duplicated content and understanding what causes the duplication. There are several ways to do this. For example, you can check pages generated by users' comments, since search engines also treat comments as content when displaying results. Also, do a quick Google or Bing search to check whether the number of indexed pages of your website is greater than the number of articles or pages you have published. Then, take a look at your categories (if you have any) and make sure that no article is in two categories at the same time. Finally, look at your site's redirects to make sure they are all set up properly. Once you've done these checks, make sure that the official URL for each page is optimal.

To optimize a canonical URL, follow the basic rules that govern URLs in general, and give the preferred URL every chance to be seen by search engines. This means, for example, including a keyword. This principle, as trivial as it may seem, plays a role in SEO: for Google's algorithms, it is more logical for an address to contain the keyword for which the page should be indexed. When inserting this key expression, be careful not to include any accent or sign that is difficult for search engines to interpret.

Adding a term to the canonical URL is of course not enough to optimize it, and should not be considered as the ultimate solution. It is also important to make sure that this URL is not too long, so that it can be easily shared on social networks and forums. On this point, the site tree structure is part of the elements that influence the length of a URL. If two pages on your site have identical primary titles, and their URLs are therefore similar, the presence of the tree structure in the address can help differentiate them. But you don't need to include the entire category chain; limit yourself to one or two elements that differentiate these two pages.

Also, to optimize the canonical URL of a page on your site, make sure it includes word separators that are easy for search engines to interpret. As a general rule, the hyphen (-) is the simplest separator. Signs such as the comma or semicolon should be avoided, while the ampersand (&) and hash (#) should be left to their usual roles: separating query parameters and pointing to an area within the page.
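The rules above (a keyword, no accents, a short address, hyphens as separators) can be sketched as a small helper. This is only an illustration: the function name slugify and the max_words limit are assumptions, not something your CMS necessarily provides.

```python
import re
import unicodedata


def slugify(title: str, max_words: int = 6) -> str:
    """Turn a page title into a short, hyphen-separated URL slug."""
    # Decompose accented letters (é becomes e + accent), then drop the accents.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase; any run of other characters (spaces, commas, &, #) is a separator.
    words = re.findall(r"[a-z0-9]+", ascii_only.lower())
    # Keep only the first few words so the URL stays easy to share.
    return "-".join(words[:max_words])


print(slugify("Café Crème: Recipes & Tips"))  # cafe-creme-recipes-tips
```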

Apart from these details, older versions of Google's Search Console also let you declare which version of your site's address (for example with or without "www") should be treated as canonical for your domain. Google has since retired that setting, so the methods described in the next section are now the way to signal your preference. Once you've gone through all these specifics, it's time to move on to the actual setup of a canonical URL.

How to set up a canonical URL and what added value does it bring?

Once you have decided which form of the address you want to prioritize for a given piece of content on your site, set it up using the method that suits you best.

1st method: in the HTML code

This method consists in placing a link tag, with the appropriate attribute, in the head section of the page. The tag must be added to all the pages with identical content, so that they all refer to the page associated with the canonical address. To do this, insert the following code between the <head> and </head> tags:

<link rel="canonical" href="https://www.urlcanonique.com/">
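If your pages are generated by a template or a script, the tag can be produced programmatically. The sketch below is a minimal illustration, assuming a hypothetical helper named canonical_tag; it refuses addresses without a scheme, since a value such as "www.urlcanonique.com" would be read as a relative path.

```python
from html import escape
from urllib.parse import urlsplit


def canonical_tag(url: str) -> str:
    """Return a <link rel="canonical"> element for an absolute URL."""
    parts = urlsplit(url)
    # Reject values without scheme or host, e.g. "www.urlcanonique.com".
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"canonical URL must be absolute: {url!r}")
    # Escape characters such as & so the attribute stays valid HTML.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


print(canonical_tag("https://www.urlcanonique.com/"))
```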

2nd method: in the HTTP header

Since June 2011, you can also set up the canonical URL by adding a line to the HTTP header. To do so, simply use the format Link: <URL>; rel="canonical", which is inserted in the response header.
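This method is typically useful for files that have no HTML head, such as PDFs. Here is a minimal sketch using a bare WSGI application; the names canonical_link_header and pdf_app, the placeholder body and the example.com address are all assumptions made for the illustration.

```python
def canonical_link_header(url: str) -> tuple:
    """Build the (name, value) pair for a canonical Link header."""
    return ("Link", f'<{url}>; rel="canonical"')


def pdf_app(environ, start_response):
    """Tiny WSGI app: serves a placeholder PDF and declares its canonical URL."""
    body = b"%PDF-1.4 placeholder"
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        # Hypothetical address of the preferred version of this document.
        canonical_link_header("https://www.example.com/downloads/white-paper.pdf"),
    ]
    start_response("200 OK", headers)
    return [body]
```

On Apache or nginx the same header is normally added in the server configuration rather than in application code.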

3rd method: in the sitemaps

Your sitemaps should list the canonical URLs of your website. This is where you add the addresses that you define as "official" for your various pages.
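As an illustration, a sitemap limited to canonical addresses can be generated with Python's standard library. The function name build_sitemap and the example.com URLs are assumptions; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls):
    """Return sitemap XML (as a string) listing each canonical URL once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # dict.fromkeys removes duplicates while keeping the original order.
    for url in dict.fromkeys(canonical_urls):
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/shoes",
]))
```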

In summary

You do not have to use any of these methods to define canonical URLs for your duplicate pages. However, by doing so, you ensure that you maintain control over the most appropriate URL to link to a given content on your site. Because, in the absence of a canonical URL, Google simply chooses the URL it deems best according to its criteria. And these criteria are not necessarily the same as yours.

rel="canonical" use cases

Beyond cases where a 301 redirect is difficult or inappropriate, several other situations may call for canonical references. This is the case, for example, when your site has several pages for products belonging to the same series that differ only in size or color. If this happens, it is better to choose a single reference page and point the variants to its URL with the link rel="canonical" tag.

The canonical URL can also be used if you have a product sorting system in which the sorting criteria appear in the address. In this case, every new page generated by the sorting must carry a link to the canonical URL, which you can add with the first setup method described above. Other situations that may call for the canonical tag are identical content on multiple domains or publication of the same content in multiple languages. In each of these contexts, add rel="canonical" to the head of the duplicate pages, pointing to the main page.
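For sorted listings, the canonical address is usually the listing URL without its sorting criteria. The sketch below assumes hypothetical parameter names (sort, order, dir); replace them with the ones your shop actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical sorting parameters; adapt to your own site.
SORT_PARAMS = {"sort", "order", "dir"}


def canonical_for_listing(url: str) -> str:
    """Strip sorting parameters (and any fragment) from a listing URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SORT_PARAMS
    ]
    # Other parameters (a category filter, for instance) are preserved.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_for_listing("https://www.example.com/shoes?sort=price&order=asc"))
# https://www.example.com/shoes
```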

Canonical URL: mistakes to avoid

Mistake #1: the same canonical URL for different contents

It can happen that, by mistake, you configure the same canonical URL for too many pages on your site, even though their contents are different. This mistake could damage your overall SEO, since some pages will be less visible.

Mistake #2: several canonical links for the same page

It is also not uncommon for some users to set up multiple canonical URLs for the same page. Search engines will retain at most one of them, and it will not necessarily be the most optimized one.
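One way to spot this mistake is to list every canonical link a page declares and check that there is exactly one. This is a minimal sketch built on Python's standard HTML parser; the class and function names are assumptions.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_tokens:
            self.hrefs.append(attributes.get("href"))


def find_canonicals(html: str) -> list:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs


page = """<html><head>
<link rel="canonical" href="https://www.example.com/a">
<link rel="canonical" href="https://www.example.com/b" />
</head><body></body></html>"""
found = find_canonicals(page)
if len(found) > 1:
    print("Several canonical links on the same page:", found)
```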

Mistake #3: a non-indexable canonical

You use canonical URLs to improve your ranking, so the canonical address itself must be indexable: it should not be blocked by robots.txt, carry a noindex directive or return an error code.

Mistake #4: using canonical URLs on pages with pagination

Sometimes, when content is displayed over several pages, you may be tempted to point the canonical of every page in the series to the first page. However, as long as pagination is in place, each page must be indexed. This error would prevent all subsequent pages from being indexed.

Mistake #5: reversal of roles between the preferred page and a secondary page

Another possible mistake is to reverse the roles. In other words, the preferred page carries a canonical tag pointing to the secondary page, when it is the secondary page that should point to the preferred one.

Case 1: Missing tags

First, look for pages that do not contain the tag.

  • If there are none: good, there is nothing to do.
  • If there are some: find the number and type of page(s) concerned. If they are pages to be de-indexed (shopping cart, etc.), check that they are de-indexed or de-index them. Otherwise, place a "self-referencing" canonical tag on them, that is, one pointing to the page itself, to tell Google that it is the original page and to prevent duplicate content.

Self-referencing canonical tag: if my website is www.monsite.fr, my tag will be: <link rel="canonical" href="https://www.monsite.fr/" />

Case 2: existing tags

Logically, the site crawl should allow you to verify the following equations:

Total number of pages = number of missing canonicals + number of canonicals present
Number of canonicals present = number of self-referencing canonicals + number of canonicals referencing another page

You therefore need to look at the canonical tags that are present and analyze their type.
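The counts in these equations can be computed from a crawl export. The sketch below assumes the crawl is available as a dictionary mapping each crawled URL to the canonical it declares (None when the tag is missing); the function names and sample URLs are illustrative only.

```python
from collections import Counter


def classify(page_url, canonical):
    """Label one page as 'missing', 'self' or 'other'."""
    if canonical is None:
        return "missing"
    return "self" if canonical == page_url else "other"


def canonical_report(crawl):
    """crawl maps each crawled URL to its canonical href (None if absent)."""
    counts = Counter(classify(url, canon) for url, canon in crawl.items())
    return {
        "total": len(crawl),
        "missing": counts["missing"],
        "present": counts["self"] + counts["other"],
        "self": counts["self"],
        "other": counts["other"],
    }


crawl = {
    "https://www.example.com/": "https://www.example.com/",
    "https://www.example.com/cart": None,
    "https://www.example.com/shoes?sort=price": "https://www.example.com/shoes",
}
print(canonical_report(crawl))
```

If "self" equals "present", you are in case 2.1 below; if it is smaller, you are in case 2.2.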

Case 2.1: as many self-referencing canonicals as canonical tags present

This means that all the tags present on your site point to the page itself: this is the most frequent use of this tag.
There is nothing to do, unless you find yourself in one of the cases that justify pointing the canonical to another page. For example, if your site republishes an article that already exists elsewhere, say to give visibility to partner bloggers, you must place a canonical tag pointing to the original page.

Case 2.2: fewer self-referencing canonicals than canonical tags present

This means that you have canonical tags that point to other pages and not to "themselves".

  • if it is a question of pagination: with paginated pages, it is very common for all the pages after the first one to have a canonical pointing to page 1. In this case, we advise you to change this setup and use self-referencing canonicals. This allows Google to explore and take note of the content of all the pages.
  • otherwise: are they duplicate pages? If they are duplicate pages that must exist (such as the content syndication example in case 2.1 above), then keep the canonical. If, on the other hand, the pages should not exist as duplicates, after a change of URLs for example, then it is not the canonical tag that should be used, but a 301 redirect from the old URL to the new one.

Indeed, a 301 redirect sends the traffic from the old page to the new one and lets you keep control of your site's internal linking, while avoiding identical pages that serve no purpose for the user.

Solutions/tips for using the canonical URL properly

Several tips can help you get the most out of canonical URLs on your websites. If you use a CMS, for example, make sure canonical URLs are configured properly. Most of these website management tools offer plugins or modules for this purpose: Frontend SEO or EFSEO - Easy for Joomla, Yoast SEO or All in One SEO for WordPress, and Metatag for Drupal.

In addition, it is recommended to use absolute addresses when setting up your canonical URLs. Relative references carry a risk of error, and some search engines may have trouble interpreting them. When a canonical URL returns an error code, in particular a 404 error, it does not prevent the indexing of the page concerned: the search engines will automatically look for another URL to treat as canonical. But since you then lose control over the optimization of your site, it is essential to check your canonical addresses carefully to ensure that they are working properly. To do this, first crawl all the internal links on your website. Then extract the list of URLs and focus on the ones that respond correctly.

Once you have this result, extract the canonical URL declared on each page and compare it to the original URL obtained during the crawl. If the two are identical, everything is fine. If they are not, check that the difference is intentional and that the canonical URL declared on the page is the right one. This verification phase also lets you detect canonical address errors, such as addresses that lead to error codes, which you should fix immediately. Also, avoid assigning a canonical URL to a page with different content.
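The verification described above can be sketched as follows, assuming the crawl has been exported as a dictionary mapping each URL to a pair (HTTP status, declared canonical or None). The function name, the issue labels and the sample data are assumptions made for the illustration.

```python
from urllib.parse import urljoin, urlsplit


def audit_canonicals(pages):
    """pages maps URL -> (HTTP status, canonical href or None).

    Returns a list of (url, issue) tuples.
    """
    issues = []
    for url, (status, canonical) in pages.items():
        if status != 200 or canonical is None:
            continue  # only pages that respond correctly and declare a canonical
        if not urlsplit(canonical).scheme:
            issues.append((url, "relative canonical"))
        target = urljoin(url, canonical)  # resolve relative hrefs against the page
        if target == url:
            continue  # self-referencing: nothing more to check
        target_status, target_canonical = pages.get(target, (None, None))
        if target_status != 200:
            issues.append((url, "canonical target does not return 200"))
        elif target_canonical is not None and urljoin(target, target_canonical) != target:
            issues.append((url, "canonical target points elsewhere"))
    return issues


pages = {
    "https://www.example.com/a": (200, "https://www.example.com/a"),
    "https://www.example.com/b": (200, "https://www.example.com/gone"),
    "https://www.example.com/gone": (404, None),
    "https://www.example.com/c": (200, "/a"),
}
for url, issue in audit_canonicals(pages):
    print(url, "->", issue)
```

The last check also catches the role reversal described earlier, where the page named as canonical itself points to another page.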

Conclusion

You now know how to set up a canonical URL to avoid duplicate content, and how to optimize it to improve your SEO.

   Article written by Louis Chevant
