peter_zaitsev (peter_zaitsev) wrote,

MySQL: Power of covering index

One of very efficient optimization method of "when nothing else helps" is using "covering index". Being pretty simple
this method is still rarely remembered about.



So what is covering index ?

Lets say you have the query "select b from tbl where a=5" Assuming column "a" is indexed and table is not clustered
by this column MySQL will need to perform at least 2 reads, at first to read index page to find row pointer and when
to read the row itself. If you would have "covering index" (a,b) instead of indexing just column "a" you will have only
index read needed, so you could expect 2 times performance improvement, sometimes a lot more.

When covering index helps a lot:

- When you have large tables. If your tables fits in memory this extra read required is inexpensive
- When you have long rows (ie with BLOBs), in such case you can have order of magnitude performance improvement
- When extra columns do not increase key length a lot.
- When you have large join with a lot of lookups on second table (see below)
- when a lot of rows match the same key value (by reading single key block you save tons of random IO)
- MyISAM tables benefit more than Innodb as MyISAM does not cache rows (only in OS cache) so random data IO is CPU intensive.

Things to watch:

- Innodb tables already clustered by PRIMARY KEY so all primary key lookups are already sort of "covered"
- If column "b" is frequently updated updating covering index on updates can be expensive
- Only full indexed columns work, if you will index column prefix, index will not be used as "covering index" for this
column.
- If you have large range scans or index scans they can slow down by increasing index length
- Watch out for adding VARCHAR part to INT index to make it covered. MySQL will pack such keyblocks and
your lookups can become unexpectedly slower
- Make sure MySQL actually chooses covered index, especially if you have several indexes with same prefix
- If there are no matching rows for the key, covering index does not help.


Example:

Here is example from BenchW (http://sourceforge.net/projects/benchw) which I worked on optimization today.

No Covering index:

mysql> explain SELECT d0.dmth, count(f.fval ) FROM dim0 AS d0, fact0 AS f WHERE d0.d0key = f.d0key AND d0.ddate BETWEEN '2010-01-01' AND '2010-12-28' GROUP BY d0.dmth;
+----+-------------+-------+-------+------------------+-------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------+-------+---------+-----------------+------+----------------------------------------------+
| 1 | SIMPLE | d0 | range | dim0_d0key,ddate | ddate | 3 | NULL | 334 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | f | ref | d0key | d0key | 4 | benchw.d0.d0key | 1000 | |
+----+-------------+-------+-------+------------------+-------+---------+-----------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

Time: 1.94 sec


Query with key (dkey0) expanded to "covering index" (dkey0,fval0)

mysql> explain SELECT d0.dmth, count(f.fval ) FROM dim0 AS d0, fact0 AS f WHERE d0.d0key = f.d0key AND d0.ddate BETWEEN '2010-01-01' AND '2010-12-28' GROUP BY d0.dmth;
+----+-------------+-------+-------+------------------+-------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------+-------+---------+-----------------+------+----------------------------------------------+
| 1 | SIMPLE | d0 | range | dim0_d0key,ddate | ddate | 3 | NULL | 334 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | f | ref | d0key | d0key | 4 | benchw.d0.d0key | 1000 | Using index |
+----+-------------+-------+-------+------------------+-------+---------+-----------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

Time: 0.66 sec


Note, in this case we had working set in memory completely. In disk bound case I would guess larger difference.

"Using Index" in Extra column is what tells us "covering index" is being properly used for this table.
  • Post a new comment

    Error

    Comments allowed for friends only

    Anonymous comments are disabled in this journal

    default userpic
  • 3 comments