Fix for Nodewords module's faulty canonical tag feature


The Drupal module Nodewords is one that many people have come to love to hate: its SEO features are second to none, but a few buggy releases have left a sour taste with many developers.

A key problem with the current stable release is that its canonical URL support is a little faulty.  When you share content across multiple sites, or allow other sites to display your content via an RSS feed, search engines might find two copies of it, one at your primary site and one elsewhere.  The canonical URL system was developed so that you could add a tag to your page to tell search engines "this URL is the official URL for this piece of content": simple, and with CMS support, completely painless.
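To make that concrete, here is a minimal sketch of the tag in question. It is written in Python purely for illustration (it is not Nodewords or Drupal code), and the function name `canonical_link_tag` is made up for this example.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Build the <link> element that marks `url` as the official address.

    Hypothetical helper for illustration only; not part of Nodewords.
    """
    # Escape the URL so characters such as "&" are valid inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(url, quote=True))


# The tag belongs in the <head> of every page that shows this content.
print(canonical_link_tag("http://example.com/products/chicken"))
```

Every copy of the content, wherever it is displayed, would carry the same tag pointing back at the primary URL.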

It's a standard feature in Drupal (using the built-in Path module) that you can create friendlier URLs for all of your pages, so instead of "http://example.com/node/123" you can have "http://example.com/products/chicken".  With a combination of modules (PathAuto, Path_Redirect, GlobalRedirect) you can set your site up to automatically create friendly URLs for all content following specific patterns (e.g. products always look like "products/product-name"), to accept extra aliases for common misspellings or renamed pages ("products" instead of "product"), and to automatically bounce the user over to the correct path regardless of which version they typed.
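The combined behavior can be sketched roughly as follows. This is a simplified Python illustration, not how PathAuto, Path_Redirect or GlobalRedirect are actually implemented; the `ALIASES` and `REDIRECTS` tables and the `preferred_path` function are assumptions invented for the example.

```python
# Hypothetical alias table: friendly alias -> internal Drupal path.
ALIASES = {
    "products/chicken": "node/123",
}

# Hypothetical redirect table: old or misspelled path -> preferred alias.
REDIRECTS = {
    "product/chicken": "products/chicken",
}


def preferred_path(requested: str) -> str:
    """Return the path a visitor should end up on for `requested`."""
    path = requested.strip("/")
    # A known misspelling or old name is sent to the preferred alias.
    if path in REDIRECTS:
        return REDIRECTS[path]
    # An internal path that has an alias is sent to that alias.
    for alias, internal in ALIASES.items():
        if internal == path:
            return alias
    # Anything else (including the alias itself) is already correct.
    return path


for requested in ("node/123", "product/chicken", "products/chicken"):
    print(requested, "->", preferred_path(requested))
```

Whichever version the visitor types, they land on the one friendly path.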

So, taking those two together, the Nodewords module should be using the friendly URL alias as the canonical URL.  Except it doesn't: by default the latest stable release just uses the internal "http://example.com/node/123" format.  A ticket to fix this was opened on drupal.org and I provided a patch that gives the site administrator a simple option to choose between the internal path and the alias, but so far the module maintainer has only said he intends to handle this via a completely different structure in his forthcoming rewrite.
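The idea behind the patch can be illustrated with a short sketch. This is Python written for this post, not the patch itself (which is PHP inside Nodewords); `BASE_URL`, `ALIAS_BY_INTERNAL_PATH`, `canonical_url` and the `use_alias` flag are all hypothetical names standing in for the site's real settings and alias lookup.

```python
# Assumed values for illustration only.
BASE_URL = "http://example.com"
ALIAS_BY_INTERNAL_PATH = {"node/123": "products/chicken"}


def canonical_url(nid: int, use_alias: bool = True) -> str:
    """Return the canonical URL for node `nid`.

    `use_alias` stands in for the administrator's setting: True uses the
    friendly alias when one exists, False keeps the internal node path.
    """
    internal = "node/{}".format(nid)
    path = internal
    if use_alias:
        # Fall back to the internal path when the node has no alias.
        path = ALIAS_BY_INTERNAL_PATH.get(internal, internal)
    return "{}/{}".format(BASE_URL, path)


print(canonical_url(123, use_alias=False))  # internal path
print(canonical_url(123, use_alias=True))   # friendly alias
```

With the option switched off you get the current stable behavior; with it on, the canonical URL matches the address visitors actually see.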

We at Bonnier Corp have been using a patched version of this module with the fix for several weeks, and I thought it might help others to try it out and see which behavior they prefer without having to apply the patch themselves.  Towards that goal, here's a zip archive of the most recent v6.x-1.11 release with the patch applied so you can make your canonical URLs nice and friendly.  If it works for you, please chime in on the issue on drupal.org.

Attachments: 

6 Comments

I think this may be the wrong

I think this may be the wrong place to manage canonical tags. Nodewords doesn't decide which is the primary piece of content, Global Redirect does. And so since GR knows what is the primary content, and Nodewords doesn't, GR should be handling the canonical tags.

That's an interesting point..

That's an interesting point, except the community has already declared Nodewords to be the module that controls this functionality (at least in D6): http://drupal.org/node/373971


I hope that some solution is

I hope that some solution is found soon for the faulty Nodewords module for Drupal, as it was such a useful tool for developers, with some really cool SEO features! Now that a patched version of the module with a fix is in use, let us hope that we will be able to overcome the problem with the canonical URL, which was behind all the problems we were having!

What no one seems to have

What no one seems to have mentioned in all of this is the reverse scenario.

When using Feeds to import nodes, I want to be able to set the canonical URL to the original URL of the piece I have imported.

It would seem to me that the easiest way from a user perspective would be to have an option:

1. Use "node/nid"
2. Use the URL alias
3. Use a token (and get Feeds to provide a token)
