peter_zaitsev (peter_zaitsev) wrote,

MySQL: Innodb Mutexes

Looking at how Innodb's own mutexes are implemented, it turns out not to be as simple as I previously thought...

What always surprised me is this data from a single-CPU box:

OS WAIT ARRAY INFO: reservation count 104200, signal count 103478
Mutex spin waits 127988, rounds 1985935, OS waits 11934
RW-shared spins 160197, OS waits 76819; RW-excl spins 18523, OS waits 15430

I was surprised that most mutex waits could be resolved by spinning (OS waits are far fewer than spin waits), since a spinlock theoretically can't work on a single CPU: there is no one else to run while you're looping, unless a context switch happens, which should be pretty rare.
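To put numbers on "much less", here is a small sketch that parses the status lines quoted above and computes a few ratios. The function name and dictionary keys are made up for illustration, and reading "OS waits / spin waits" as the share of spin waits that ended in an OS wait is my assumption about how the counters relate, not something the output itself states.

```python
import re

# Status excerpt copied verbatim from the output quoted above.
STATUS = """\
OS WAIT ARRAY INFO: reservation count 104200, signal count 103478
Mutex spin waits 127988, rounds 1985935, OS waits 11934
RW-shared spins 160197, OS waits 76819; RW-excl spins 18523, OS waits 15430
"""

def parse_counters(text):
    """Extract the mutex and RW-lock counters and derive simple ratios."""
    mutex = re.search(
        r"Mutex spin waits (\d+), rounds (\d+), OS waits (\d+)", text)
    spin_waits, rounds, os_waits = map(int, mutex.groups())

    rw = re.search(
        r"RW-shared spins (\d+), OS waits (\d+); "
        r"RW-excl spins (\d+), OS waits (\d+)", text)
    s_spins, s_os, x_spins, x_os = map(int, rw.groups())

    return {
        # Assumed reading: fraction of spin waits that fell through to an OS wait.
        "mutex_os_wait_share": os_waits / spin_waits,
        # Average number of loop rounds spent per spin wait.
        "mutex_rounds_per_spin_wait": rounds / spin_waits,
        "rw_shared_os_wait_share": s_os / s_spins,
        "rw_excl_os_wait_share": x_os / x_spins,
    }

ratios = parse_counters(STATUS)
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

Under that reading, roughly 9% of mutex spin waits ended in an OS wait, at about 15.5 rounds per spin wait. The RW-lock lines look quite different (about 48% for shared and 83% for exclusive), so the surprise applies mainly to the plain mutex line.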

Digging into this, I found that what Innodb calls a spinlock is not a traditional spinlock. The Innodb spinlock does not only spin in a loop; it also calls pthread_yield() at the end of the loop if the mutex still could not be acquired. This is also a sort of "OS wait", but it is not accounted as such in the stats.
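The spin-then-yield idea can be modeled in a few lines. This is a Python sketch of the behavior described above, not the C code in Innodb: the function name and the round count of 20 are placeholders I picked, threading.Lock stands in for the mutex lock word, and os.sched_yield() (available on Unix only) stands in for pthread_yield().

```python
import os
import threading

SPIN_ROUNDS = 20  # hypothetical value, chosen only for illustration

def spin_then_yield(lock, spin_rounds=SPIN_ROUNDS):
    """Poll the lock for a bounded number of rounds, then yield once.

    Returns True if the lock was acquired, False if it is still held
    after the yield (the caller would then fall back to a real wait).
    """
    # Phase 1: spin, retrying a non-blocking acquire.
    for _ in range(spin_rounds):
        if lock.acquire(blocking=False):
            return True

    # Phase 2: give up the CPU so the current holder gets a chance to run.
    # This is the step that makes "spinning" workable on a single CPU.
    os.sched_yield()

    # Phase 3: one more attempt after being rescheduled.
    return lock.acquire(blocking=False)
```

On a single CPU the first phase cannot succeed unless the holder happens to be scheduled in the middle of it, so most of the successes would come from the attempt after the yield.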

What is counted as an "OS wait" happens only if yielding control for a while did not let the thread get the mutex the next time it was scheduled. In this case Innodb waits on a condition variable (it keeps a large array reserved for this purpose).

It is also interesting that Innodb broadcasts the condition to wake up all waiting threads, rather than handing the mutex over in FIFO order. FIFO is typically not what you want: two threads taking the same mutex over and over in a loop would never get their work done in that case.
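The condition-variable fallback and the broadcast wake-up can be sketched as a toy model. Everything here is hypothetical: the class and method names are mine, the spin-and-yield phase is left out, and the whole state sits under one Python Condition, whereas the real code uses a separate lock word and an array of wait cells. The os_waits counter here counts every sleep, including a repeated sleep after losing the race, which may not match how Innodb counts.

```python
import threading

class BroadcastMutex:
    """Toy mutex: waiters sleep on one condition and are all woken on unlock."""

    def __init__(self):
        self._cond = threading.Condition()
        self._locked = False
        self.os_waits = 0  # number of times a thread went to sleep

    def lock(self):
        with self._cond:
            # Re-check after every wake-up: a broadcast wakes everyone,
            # but only one thread finds the mutex free.
            while self._locked:
                self.os_waits += 1
                self._cond.wait()
            self._locked = True

    def unlock(self):
        with self._cond:
            self._locked = False
            # Broadcast, not a FIFO handoff: whichever woken (or newly
            # arriving) thread runs first takes the mutex.
            self._cond.notify_all()

def run_demo(n_threads=4, iterations=50):
    """Have several threads increment a shared counter under the mutex."""
    mutex = BroadcastMutex()
    counter = [0]

    def worker():
        for _ in range(iterations):
            mutex.lock()
            counter[0] += 1
            mutex.unlock()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0], mutex.os_waits

if __name__ == "__main__":
    total, os_waits = run_demo()
    print("increments:", total, "os waits:", os_waits)
```

Because nobody is promised the mutex, a thread that releases it and immediately asks for it again can simply keep going instead of being forced to sleep behind every other waiter.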

We're now going to experiment a bit to see whether all these complications actually improve performance, and to add some tuning options so the behavior can be adapted to different workloads. For example, you probably would not want the spinlock running a tight loop on single-CPU boxes.
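As an illustration of the kind of tuning meant here, a spin count could be chosen from the number of CPUs. This is purely a sketch: the function name, the default of 20 rounds, and the policy of skipping the spin phase entirely on one CPU are my assumptions, not an existing MySQL setting.

```python
import os

def pick_spin_rounds(cpu_count=None, default_rounds=20):
    """Return how many rounds to spin before yielding.

    Hypothetical policy: with a single CPU the lock holder cannot run
    while we spin, so skip spinning and go straight to the yield.
    """
    if cpu_count is None:
        # os.cpu_count() may return None when the count is unknown.
        cpu_count = os.cpu_count() or 1
    if cpu_count <= 1:
        return 0
    return default_rounds

print("spin rounds on this machine:", pick_spin_rounds())
```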

We'll also be adding more performance data gathering in MySQL 5.0, so it will be easier to see hot mutexes.

If someone is interested in the implementation details, sync0sync.c is the file to look at.