Paul Roub

blah blah blahg

Rolling Your Own URL Shortener

Assumptions: you have a Movable Type blog and rights to the underlying database, you can create a PHP script on the same server, and you can edit either .htaccess or httpd.conf. Similar approaches should work fine with other blog software, other languages, etc. Yes, PHP sucks, etc… but on my particular server it’s the lowest-overhead scripting option, so it wins.

Anyway.

I noticed that on Twitter, Jeffrey Zeldman is using some sort of self-hosted URL shortener. See, for example, this tweet, wherein http://www.zeldman.com/x/28 is linked to, but you end up at the full-length URL of the post.

I assume this is an attempt to fit URLs into Twitter without using third-party URL shorteners, which we all know will end life as we know it.

Since I’d been doing the same thing, I thought I’d share the script.

I’m cheating a bit — rather than build a general-purpose shortener, I’m just linking to blog posts by their internal Movable Type ID. Given a shortish domain name (roub.net), adding even a large integer keeps things plenty short enough that none of my Twitter clients tries to “clean it”.

An example: http://roub.net/552 leads to http://roub.net/blahg/archives/2009/06/reminder-aroma.html. 552 is just Movable Type’s unique ID for this blog post. A nice detail, from my point of view, is that all of my blogs (personal, work and openmikes.org) are in one database, with unique IDs across all. So one shortener works for all three, given a bit of care.

That care?

select blog_id, entry_basename, blog_site_url,
       unix_timestamp(entry_authored_on) as stamp
from mt_entry
join mt_blog on entry_blog_id = blog_id
where entry_id = 552

‘entry_basename’ leads to “reminder-aroma”; ‘blog_site_url’ gives ‘http://roub.net/blahg’; and the ‘2009/06’ is parsed from the authored_on date. There’s minor goofiness related to one of my blogs’ URLs being formatted differently than others. See the full script for details.
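For the curious, here is a rough sketch of that assembly step. It is in Python rather than PHP, and it is not the actual script: the function name build_permalink, the fixed "archives" path segment, and the sample timestamp are my own assumptions, based only on the example URL above. It also ignores the one blog whose URLs are formatted differently.

```python
from datetime import datetime, timezone


def build_permalink(site_url: str, basename: str, stamp: int) -> str:
    """Assemble <site_url>/archives/YYYY/MM/<basename>.html from one query row."""
    # Assumption: the timestamp is read as UTC. A post authored near a month
    # boundary in another time zone could land in the wrong YYYY/MM folder.
    authored = datetime.fromtimestamp(stamp, tz=timezone.utc)
    return f"{site_url.rstrip('/')}/archives/{authored:%Y/%m}/{basename}.html"


# Hypothetical row values; the timestamp is 2009-06-15 12:00:00 UTC.
print(build_permalink("http://roub.net/blahg", "reminder-aroma", 1245067200))
```

With those made-up inputs, the result is the same URL as in the example: http://roub.net/blahg/archives/2009/06/reminder-aroma.html.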

So how does it get called? Via Apache’s mod_rewrite. In my case, I stuck it right in the site’s Apache config file. But it will work fine in .htaccess, with minor adjustments (in per-directory context the pattern is matched without the leading slash), assuming per-directory rewrites are enabled on your system. If they’re not, complain to your ISP. If they don’t know what you’re talking about, switch ISPs.

The Apache magic:

RewriteRule     ^/([0-9]+)      /findblog.php?id=$1 [L]

Which translates to “if we see a URL starting with some digits, hand the request to findblog.php, and give it that number we saw.” The script takes that, builds SQL as above, and issues a 301 (Permanent) Redirect to the blog URL. If it can’t find a match, doesn’t see a number, or in any other way gets confused, it returns 404 Not Found.
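If it helps to see the decision logic in one place, here is a minimal sketch. Again, it is Python rather than the real PHP, and it folds the rewrite step and the script into a single function for illustration. The names SHORT_PATH and respond are hypothetical, and the dictionary stands in for the database query.

```python
import re
from typing import Callable, Optional, Tuple

# Same pattern as the RewriteRule: a leading slash followed by digits.
SHORT_PATH = re.compile(r"^/([0-9]+)")


def respond(path: str, lookup: Callable[[int], Optional[str]]) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a request path."""
    match = SHORT_PATH.match(path)
    if match is None:
        return 404, None  # no number at the start of the URL
    url = lookup(int(match.group(1)))
    if url is None:
        return 404, None  # no entry with that ID
    return 301, url  # permanent redirect to the post


# Hypothetical stand-in for the database query.
known = {552: "http://roub.net/blahg/archives/2009/06/reminder-aroma.html"}

print(respond("/552", known.get))
print(respond("/999", known.get))
print(respond("/blahg/", known.get))
```

One thing the sketch makes visible: the pattern has no end anchor, so a path like /552abc resolves the same as /552. That seems harmless here, but adding a trailing $ to the pattern would tighten it if you care.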

To make my life easier when publishing, I’ve also added the “short link” next to the permalinks on this blog.

Who’s done this for other blog software?