Posts tagged with 'seo'

Rev-canonical should be handled with care

A short while back, Google and a bunch of other search engines launched @rel="canonical", a standard for specifying that the current page is a copy of another, more canonical, version hosted elsewhere. I blogged about it at the time , and generally approved of the idea but warned against overuse when an HTTP redirect might be more sensible.

Recently there's been a large amount of discussion about @rev="canonical" , a proposal that seems to have been floated with the intention of providing URL shortening services. The idea is that my page can 'advertise' some other URLs that it can be found at so that clients can pick a different one to use when referring to it.

In this particular use case I could publish a page at http://ciaranmcnulty.com/blog/2009-04-14/a-long-blog-post-with-a-complex-url that had a @rel="canonical" link to http://ciaran.ws/complex (I don't really have that domain, don't bother trying it). Applications that wanted a shorter URL for the content (e.g. Twitter clients, SMS gateways) could then use my shorter URL rather than having to get a more obfuscated one from TinyURL or somewhere similar.

The number of sites that have already included the markup is staggering in such a short time, and a testament to how a simple markup idea like this can really take off (if only Microformats could gain this kind of uptake!). I've been reading a lot of the commentary that's bouncing around the HTML blogosphere, and thought I'd put my £0.02 in. Frankly, I fail to see the point of all the hooh-hah, for the following reasons:

Rel-canonical should be handled with care

Something we've been telling clients for years is to not publish the same information in more than one place. There are many reasons for this from the point of view of web semantics, but the one that makes the clients listen is when we say that Google will penalise their site for it.

As of today Google allow duplicate content as long as you indicate clearly which version is the canonical one. This entails adding something like the following to the HEAD element in your duplicated page, pointing back to the original:

<link rel="canonical" href="/the-other-page" />

This approach has been welcomed by many, but I'm fearful that it is duplicating already-existing web semantics as well as encouraging bad habits in web authors.

Keeping querystrings clean with Zend Framework

I'm something of a zealot about short, readable URLs. Most people by now are using server rewrite rules and Front Controllers of some sort to keep the paths in their application sane and legible, but an area that's often overlooked is the querystring, especially after a form submission.

A typical example of a situation with a 'messy' querystring might be a search form on site. When a form is submitted, all the elements in the form are entered into the querystring whether they're relevant or not.

That leads to querystrings like /search?size=&shape=&colour=red&age=, which would be a lot more legible as /search?colour=red

On some sites I work with, an 'advanced search form' can have up to 20 elements, all of which might get submitted in one go leaving the search results page with a huge URL.