I rant against the Drupal community

I like Drupal (the engine running this site), but rather often they just don’t care about what truly matters. So, besides still not having decent forums, working trackbacks, configurable RSS feeds, the archive module and a decent search engine, now we also get corrupted databases “by default”. How fun.

You know, I usually dump the DB to back it up for safety reasons. Well, the current Drupal install happily corrupts its data right as you try to dump it. Enjoy your backup.


From my point of view this is a rather serious problem, and I foresee it becoming rather widespread if it isn’t solved quickly. After a few hours of research and a headache, here’s what I discovered:

– MySQL 4.1.x adds the possibility to set a collation for the database. This seems to be a new feature that wasn’t there before.

– By default, it seems that every database created or imported is automatically set to “latin1_swedish_ci”. The whole database gets that collation when you install Drupal under that version of MySQL or import a previous dump.

– This causes serious corruption when exporting the database, because accented and other UTF-8 characters are just NOT COMPATIBLE with the latin1_swedish_ci collation.

– This means that if you install Drupal on MySQL 4.1.x, the very first time you export the database for a backup or whatever else, you’ll end up with a corrupted dump, because all the accented characters in the nodes, comments and aggregator items get replaced with GARBAGE. For example: “Saturdayâ€™s Teen People” in place of “Saturday’s Teen People”. This is taken from my now broken database, and since I noticed it too late, I cannot do anything now except manually change every single entry. How fun.

– In the handbook, install.txt and all the other install guides for Drupal THERE IS NO MENTION of the collation. It’s written nowhere how to set it, so everyone just follows the standard instructions and ends up with “latin1_swedish_ci” everywhere, as happened to me, including on the text in the aggregator items, node and comment bodies.

– How the hell must the DB be configured now? Because from what I read here, it’s not even possible to set Drupal to use utf8, because it’s still not compliant.
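For anyone wondering where this kind of garbage comes from, it can be reproduced outside MySQL. What follows is only a minimal Python sketch, not anything taken from Drupal or MySQL: it assumes the text was stored as UTF-8 bytes and that something along the way decoded those bytes as cp1252, which is the encoding MySQL calls latin1. The helper name `misread_as_latin1` is made up for illustration.

```python
def misread_as_latin1(text: str) -> str:
    """Hypothetical helper: encode text as UTF-8, then decode the
    resulting bytes as cp1252 (what MySQL calls latin1)."""
    return text.encode("utf-8").decode("cp1252")


# \u2019 is the curly apostrophe, which takes three bytes in UTF-8.
original = "Saturday\u2019s Teen People"
garbled = misread_as_latin1(original)

# Each of the three UTF-8 bytes shows up as its own cp1252 character.
print(garbled)  # Saturdayâ€™s Teen People
```

One character turning into three is the typical signature of this kind of mismatch, so it is a quick thing to look for in a suspicious dump.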

So, besides having a database that is now unrecoverable, how should I set it up so that it works properly from now on and I can back it up without getting unrecoverable garbage text?

And I seriously suggest you patch the guides and the Drupal package to stop this, or it will become a rather large problem, considering that if you follow the instructions step by step you unavoidably run into this corruption.

– HRose / Abalieno


EDIT – I somewhat managed to obtain non-corrupt dumps, at least from now on, since the existing data is still screwed. The “MySQL connection collation” field on the home page of phpMyAdmin was set to utf8, conflicting with the database, which was set to latin1. Now I have both set to latin1 and it’s better. The database still has to store odd characters that are then converted when the page loads, but it works for now.
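My best guess at why this workaround holds together, as a rough Python sketch. It assumes that Drupal sends UTF-8 bytes and that MySQL only converts text when the connection and the table disagree about the character set. The function names are hypothetical and nothing here talks to a real database.

```python
def store_as_latin1(raw: bytes) -> str:
    """Hypothetical: how a latin1 (cp1252) table interprets incoming bytes."""
    return raw.decode("cp1252")


def read_with_connection(stored: str, connection_charset: str) -> bytes:
    """Hypothetical: bytes the server hands back for a given connection charset."""
    return stored.encode(connection_charset)


page_text = "caf\u00e9 \u2019"    # what the site wants to show
sent = page_text.encode("utf-8")  # assumed: Drupal sends UTF-8 bytes
stored = store_as_latin1(sent)    # the "odd characters" sitting in the table

# Connection and table both latin1: nothing is converted, the bytes survive.
same_charset = read_with_connection(stored, "cp1252")
print(same_charset == sent)  # True

# Connection utf8, table latin1: the odd characters are converted again.
mismatched = read_with_connection(stored, "utf-8")
print(mismatched == sent)  # False

# Caveat: Python's cp1252 codec leaves five byte values undefined, so this
# sketch raises an error for some characters that MySQL would still accept.
```

If that reading is right, it would explain why the dump only came out clean once both settings agreed, while the stored text still looks wrong when viewed as latin1.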
