peter_zaitsev (peter_zaitsev) wrote,

MySQL: Loading large tables with Unique Keys

Normally MySQL is rather fast loading data in MyISAM table, but there is exception, which is when it can't rebuild indexes by sort but builds them
row by row instead. It can be happening due to wrong configuration (ie too small myisam_max_sort_file_size or myisam_max_extra_sort_file_size) or
it could be just lack of optimization, if you're having large (does not fit in memory) PRIMARY or UNIQUE indexes.

Below is the internal letter I've sent out on this subject which I guessed would be good to share

Today on my play box I tried to load data into MyISAM table (which was
previously dumped as mysqldump --tab)

The data was some 1.3G, 15.000.000 rows, 512MB memory one the box.
MySQL 4.1.8

One could could call it trivial fast task, unfortunately I had
unique key on varchar(128) as part of the schema.

The load took some 3 hours before I aborted it finding out it was just
about 20% done.

The problem is - unique keys are always rebuilt using key_cache, which
means we're down to some 100-200 rows/sec as soon as index becomes
significantly larger than memory.

You can't go away with ALTER TABLE DISABLE KEYS as it does not affect
unique keys.

Actually even Innodb with all its transactional overhead and inability
to repair keys by sort can do better if SET UNIQUE_CHECKS=0 is used.

What I would guess could be done for MyISAM ?
At least it could support UNIQUE_CHECKS=0 leaving on user
responsibility if it is really so. If I'm restoring from backup I'm
pretty sure about that.

If we want neat solution check if key is really unique could be done
after data is loaded. It is much faster, especially for physically
sorted key we end up with.

The workaround which I found so far is really ugly, however I've seen
users using it with good success. You can create table of the same
structure without keys, load data into it to get correct .MYD, Create
table with all keys defined and copy over .frm and .MYI files from it,
followed by FLUSH TABLES. Now you can use REPAIR TABLE to rebuild all
keys by sort, including UNIQUE keys.
  • Post a new comment


    Comments allowed for friends only

    Anonymous comments are disabled in this journal

    default userpic