MySQL database dump occasionally exceeds limits

Scott Woods 16.Aug.2011 8:21pm
Hi there,

This is really more of a MySQL issue than a Webiva issue at heart, but it comes up with some webiva internal tables, so I’m posting it here, and am hoping for some insight.

Our sysadmin writes:

= begin quote =

Unfortunately, the database dump cannot be restored as is. I attempted to do so on a test MySQL server elsewhere (not the production db server), and get the error:

ERROR 1071 (42000) at line 1912: Specified key was too long; max key length is 1000 bytes

Googling this, there’s a longstanding design limitation in MySQL that hasn’t been rectified, at least not in the version of MySQL we’re running. Basically, you can’t have indexes longer than 1000 bytes. UTF-8 encoding can use up to 3 bytes per character, so if you have a long varchar column and index it, the additional data in the index added to the variable length can blow past that 1000 byte limit. Apparently, what typically happens is that indexes that don’t blow out the limit are created on latin1 tables/dbs, then the db or table is converted to UTF-8 after the fact. Latin1 only takes 1 byte/character, so when the index is defined it’s okay, but after conversion to UTF-8, and a subsequent dump and restore, the index definition in the restored db is too long.

The error in this case is thrown by this CREATE TABLE definition:

CREATE TABLE `end_user_actions` (
`id` int(11) NOT NULL auto_increment,
`end_user_id` int(11) default NULL,
`admin_user_id` int(11) default NULL,
`level` int(11) default ‘1’,
`created_at` datetime default NULL,
`action_at` datetime default NULL,
`target_type` varchar(255) collate utf8_unicode_ci default NULL,
`target_id` int(11) default NULL,
`custom` tinyint(1) default ‘0’,
`renderer` varchar(255) collate utf8_unicode_ci default NULL,
`action` varchar(255) collate utf8_unicode_ci default NULL,
`path` varchar(255) collate utf8_unicode_ci default NULL,
`identifier` varchar(255) collate utf8_unicode_ci default NULL,
PRIMARY KEY (`id`),
KEY `user_date_index` (`end_user_id`,`action_at`),
KEY `action_index` (`renderer`,`action`,`action_at`),
KEY `target_index` (`target_type`,`action_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
SET character_set_client = @saved_cs_client;

The index on ‘renderer’ and ‘action’ apparently blows up.

One possible workaround is to edit the db dump and change the UTF-8 references to latin1, but I’ve no idea on how that will affect Webiva.

= end quote =

He wrote back a bit later to say that the conversion to latin1 allowed the dump to restore. But the problem is that we really don’t understand the full ramifications of this to webiva. Since it’s on renderer/action columns, and this is an entirely U.S. website that’s being restored, I wouldn’t think we’d have problems, but we’re not really sure. Do we leave it latin1 afterwards? Convert back to UTF8 after the restore?

Thanks in advance, any thoughts on the matter would be appreciated!

Scott
Pascal Rettig 16.Aug.2011 8:40pm
Hi Scott,

This is an issue with us not being careful when creating the tables – the indexed varchar(255) columns (renderer, action, target_type) can be pulled down to varchar(64)’s which should solve the indexing problem for mysql – none of those fields should exceed 64 characters and reducing the size should fix the index (64 * 3 * 3 = 576 bytes for the indexes, ignoring the dates)

If you just change the sql import it should import fine and run the appropriate alter commands on your existing databases.

We’ll probably modify the existing migrations at some point – I haven’t run across that issue before though, so it may be a specific version of mysql.

-Pascal
Scott Woods 16.Aug.2011 8:49pm
Great info, thanks Pascal!

For what it’s worth, this system is running:

CentOS release 5.6 (Final)
mysql Ver 14.12 Distrib 5.0.77, for redhat-linux-gnu (x86_64) using readline 5.1