Migrating from Blosxom to WordPress

I decided to move from Blosxom to WordPress, first for my personal blog and then for my tech blog. Since I had about 550 postings and around 40 comments in my personal blog, I needed a way to migrate my data. Googling didn’t find anything even remotely useful (the “import via RSS” suggestions simply lost too much formatting - things looked terrible, given how many pictures I have). Instead I figured I’d write a Perl script that would do the hard work: pull all the postings and comments from Blosxom and import them into WordPress. Looking at the structure of the existing import scripts (and given that I know far less PHP than Perl), I decided not to integrate this into WordPress but instead to insert the data directly into the MySQL database. That should be fun. And amazingly, it took not nearly as long as I feared!

Now I want to share what I learned with the rest of you, but the more I look at the script that I wrote, the more I realize that it is based on so many assumptions that it might be almost useless to anyone else. But then again, maybe it can help someone in a similar situation as a starting point. Writing it certainly helped me understand why WordPress doesn’t have an import function for Blosxom.

Here’s the fundamental idea of what I did:

  • install WordPress on the target system. One assumption made in the script is that you can access the MySQL database from a system that has the Blosxom files accessible in its file system.
  • set up the new blog. Depending on your needs you may have to find (or write) a theme that is similar to your Blosxom theme. In my case (the personal blog, not this one) the formatting of many of the postings depended on a fixed-width theme of a certain width, with certain classes defined in the CSS, certain margins set around different HTML elements, etc. So I started from something reasonably similar and then more or less wrote my own theme.
  • delete the default posting and comment, make any other changes you want (blogroll, etc) and set up all your categories (important - the script will fail if it finds a category that doesn’t exist).
  • back up your database
  • I mean it. Use the wp-backup plugin. Or do it manually in mysql. But back it up. Really. I restored this backup quite a few times while working around bugs in the script, typos in the blog postings, etc.
  • download the blosxomtowp.pl script.
  • read the script. Edit the variables at the top of the script. Look through the assumptions made. Here are the ones I’m aware of, but you really might want to read through the script and compare with your file system layout, posting structure, etc.
    • it assumes that you have shell access to the machine that Blosxom runs on and that you can connect to the WP MySQL database from that machine
    • it assumes that you use directories under the main Blosxom blog directory for your category hierarchy - just as you would with the “categorytree” plugin
    • it assumes that you use the “meta” and “metadate” plugins to set the date on your postings (but it’s easy to change this to use the file time stamp instead - I just haven’t done that)
    • it assumes that you are using the “feedback” plugin for comments (but I think “writeback” and some others have similar file layouts and formats)
    • it assumes that you have already created /all/ categories that you have in blosxom in your WP database
    • it assumes the database table layout in WP-2.0.5

    Figure out what else you want to preserve (assuming you have different plugins than I had). Figure out what you can live without.

  • you did back up the wp database, right?
  • go to the main directory of your blosxom tree and run the script on one posting
    …/path/blosxomtowp.pl misc/aposting.blog (note that I used “.blog” as suffix - for most people that will be “.txt”).
  • check your blog in a web browser. Did the posting show up? Does everything look right?
  • start debugging. wp-phpmyadmin was a huge help for me in seeing what went wrong in the MySQL database
  • once this works for a few postings you can slurp all of it in (don’t forget to restore the backup first, so you don’t get duplicate postings):
    find . -name \*.blog | xargs …/path/blosxomtowp.pl
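
To make a few of the assumptions listed above more concrete, here is a minimal sketch in Python. It is not the blosxomtowp.pl script and does not touch the database; it only illustrates how a category hierarchy can be derived from an entry's directory, how a title, "meta-" header lines and body might be split, and how to check up front for categories that have not been created yet. The function names and the exact header layout (title on the first line, then "meta-key: value" lines) are assumptions made for illustration; compare them with your own postings before relying on them.

```python
import os


def category_of(entry_path):
    """Return the category hierarchy implied by an entry's directory,
    e.g. "./tech/linux/kernel.blog" -> ["tech", "linux"]."""
    directory = os.path.dirname(os.path.normpath(entry_path))
    return [part for part in directory.split(os.sep) if part not in ("", ".")]


def parse_entry(text):
    """Split an entry into (title, meta, body).

    Assumed layout: the first line is the title, followed by optional
    "meta-key: value" header lines, followed by the body.
    """
    lines = text.splitlines()
    title = lines[0].strip() if lines else ""
    meta = {}
    index = 1
    # Collect header lines until the first line that is not "meta-...: ..."
    while (index < len(lines)
           and lines[index].startswith("meta-")
           and ":" in lines[index]):
        key, value = lines[index][len("meta-"):].split(":", 1)
        meta[key.strip()] = value.strip()
        index += 1
    body = "\n".join(lines[index:]).strip()
    return title, meta, body


def missing_categories(entry_paths, known_categories):
    """List category names used by the entries but not in known_categories."""
    used = {name for path in entry_paths for name in category_of(path)}
    return sorted(used - set(known_categories))
```

Running something like missing_categories() over the output of the find command before the real import would show which categories still need to be created in WordPress, instead of finding out when the import fails halfway through.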

I’m sure I’m forgetting a lot of things here. Please comment if you have additions, improvements, or suggestions. The script is under the GPL. I’d be happy to accept fixes from anyone, but especially from people who are better at writing Perl than I am (that’s not a high hurdle) and who can help me clean up the code.

Thanks for visiting!
I hope this was helpful - if not, please leave a comment and let me know why! Were you searching for something else? Did I miss an important aspect?

1 Comment so far

  1. [...] from Blosxom to WordPress and keeping the subcategories intact. Neither Jason Clark’s, nor hohndel.org’s worked without a lot of work. I’ve had no luck with Wordpress’s own wiki or the [...]
