Posts from February 2009

Rel-canonical should be handled with care

Something we've been telling clients for years is to not publish the same information in more than one place. There are many reasons for this from the point of view of web semantics, but the one that makes the clients listen is when we say that Google will penalise their site for it.

As of today Google allow duplicate content as long as you indicate clearly which version is the canonical one. This entails adding something like the following to the HEAD element in your duplicated page, pointing back to the original:

<link rel="canonical" href="/the-other-page" />

This approach has been welcomed by many, but I'm fearful that it is duplicating already-existing web semantics as well as encouraging bad habits in web authors.

Combining bordering ranges of data in MySQL

A while back, a friend of mine was working on a database that contained bookings for some equipment, and he needed to separate it out into blocks representing when it was free and busy, regardless of whether the busy times were bookings by the same people or not.

Just today, a different friend who works with mobile phones was dealing with a table that contained IP ranges and which network operator owned them. He commented that he suspected a lot of the ranges overlapped or adjoined each other, and could be combined together into larger blocks.

Both of these problems come down to finding contiguous (or bordering) ranges inside data.

It's tempting to approach the problem iteratively: find a block, try to find the blocks bordering it, and work outwards from there. But it's possible to solve this sort of thing non-iteratively with a single SQL query, which is what I'd like to show you how to do using MySQL.
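To give a feel for what "non-iterative" can mean here, below is a minimal sketch of one well-known way to merge overlapping or adjoining ranges in a single query. It is not necessarily the query this post goes on to build. The table name `ranges`, the columns `start_at` and `end_at`, and the helper `merge_ranges` are all made up for illustration, and the sketch runs against an in-memory SQLite database from Python purely so it is easy to try. The query uses only joins, `NOT EXISTS` and `GROUP BY`, so it should carry over to MySQL, though I haven't tuned it for large tables.

```python
import sqlite3

# Hypothetical schema: ranges(start_at, end_at). Two ranges count as
# bordering when one ends exactly where the other starts, or when they overlap.
MERGE_SQL = """
SELECT s.start_at AS block_start, MIN(e.end_at) AS block_end
FROM ranges AS s
JOIN ranges AS e ON e.end_at >= s.start_at
-- s starts a block: no other range begins earlier and reaches s.start_at
WHERE NOT EXISTS (
        SELECT 1 FROM ranges AS x
        WHERE x.start_at < s.start_at AND x.end_at >= s.start_at)
-- e ends a block: no other range begins by e.end_at and runs past it
  AND NOT EXISTS (
        SELECT 1 FROM ranges AS y
        WHERE y.start_at <= e.end_at AND y.end_at > e.end_at)
GROUP BY s.start_at
ORDER BY s.start_at
"""


def merge_ranges(rows):
    """Load (start, end) pairs into a scratch table and return merged blocks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ranges (start_at INTEGER NOT NULL, end_at INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO ranges VALUES (?, ?)", rows)
    blocks = conn.execute(MERGE_SQL).fetchall()
    conn.close()
    return blocks


if __name__ == "__main__":
    sample = [(1, 3), (3, 5), (4, 8), (10, 12), (12, 13), (20, 25)]
    for block in merge_ranges(sample):
        print(block)
```

The idea is to find every range start that nothing else reaches (the start of a block) and every range end that nothing else extends past (the end of a block), then pair each block start with the nearest block end at or after it. No looping is needed, just the one statement.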