The boss man figured out something this morning.
If you are at the borderline of over-committing memory on a Libvirt/KVM host, you might want to think about not using KSM.
We have various guests on the KVM host system, and started having performance issues with MySQL-driven sites on the guests.
With /sys/kernel/mm/ksm/run = 1, which is what we had prior to 7:30 am, we saw terrible response times and high disk I/O and latency on the guests.
Turn it off in Ubuntu with:
/etc/default/qemu-kvm
and edit the appropriate line,
or use an "echo 0 > /sys/kernel/mm/ksm/run" trick to test on a live system (0 stops ksmd; 1 is what turns it on).
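For the live-system test, here is a minimal sketch, assuming the standard sysfs path from the kernel docs quoted at the end of this post. KSM_DIR and ksm_set_run are names made up for illustration, not part of any package. Writing 0 stops ksmd but keeps the pages it already merged; writing 2 stops it and unmerges everything. It needs root on a real host.

```shell
#!/bin/sh
# KSM_DIR is a hypothetical override so the function can be tried on a scratch directory.
KSM_DIR="${KSM_DIR:-/sys/kernel/mm/ksm}"

# ksm_set_run MODE: write 0, 1 or 2 to the KSM run file.
#   0 = stop ksmd, keep merged pages
#   1 = run ksmd
#   2 = stop ksmd and unmerge all merged pages
ksm_set_run() {
    case "$1" in
        0|1|2) ;;
        *) echo "usage: ksm_set_run 0|1|2" >&2; return 2 ;;
    esac
    if [ ! -w "$KSM_DIR/run" ]; then
        echo "cannot write $KSM_DIR/run (not root, or no KSM support?)" >&2
        return 1
    fi
    echo "$1" > "$KSM_DIR/run"
}
```

Something like "ksm_set_run 0" is the quick test. "ksm_set_run 2" also unmerges, which gives back the memory savings, so be careful with it on a host that is already borderline.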
Take a look at the nice performance afterwards:
So it looks like borderline over-subscribed memory is a bad place to hang out with KSM enabled on a KVM host system!
KSM might still be OK here, but we need to check on KSM tuning so that ksmd does not eat so much CPU on the host, via:
/sys/kernel/mm/ksm/sleep_millisecs
A larger value lets the system do useful work instead of scanning for pages so often.
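A rough sketch of that tuning step, again assuming the sysfs path from the kernel docs. ksm_set_sleep and KSM_DIR are hypothetical names, and the 200 in the usage note is only an example value, not a recommendation.

```shell
#!/bin/sh
# KSM_DIR is a hypothetical override so the function can be tried on a scratch directory.
KSM_DIR="${KSM_DIR:-/sys/kernel/mm/ksm}"

# ksm_set_sleep MS: make ksmd sleep MS milliseconds between scan batches,
# and print the old and new values so the change is easy to revert.
ksm_set_sleep() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: ksm_set_sleep MILLISECONDS" >&2; return 2 ;;
    esac
    old=$(cat "$KSM_DIR/sleep_millisecs") || return 1
    echo "$1" > "$KSM_DIR/sleep_millisecs" || return 1
    echo "sleep_millisecs: $old -> $1"
}
```

For example, "ksm_set_sleep 200" (as root) makes ksmd wake up ten times less often than the documented default of 20. The setting does not survive a reboot, so anything that works out would still need to go into the boot-time config.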
Ok, so it pays to upgrade sometimes:
https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/578930
From the kernel docs on KSM: http://www.kernel.org/doc/Documentation/vm/ksm.txt
Notice the disclaimer on the defaults, e.g. for sleep_millisecs: Default: 20 (chosen for demonstration purposes)
pages_to_scan - how many present pages to scan before ksmd goes to sleep
  e.g. "echo 100 > /sys/kernel/mm/ksm/pages_to_scan"
  Default: 100 (chosen for demonstration purposes)

sleep_millisecs - how many milliseconds ksmd should sleep before next scan
  e.g. "echo 20 > /sys/kernel/mm/ksm/sleep_millisecs"
  Default: 20 (chosen for demonstration purposes)

run - set 0 to stop ksmd from running but keep merged pages,
  set 1 to run ksmd e.g. "echo 1 > /sys/kernel/mm/ksm/run",
  set 2 to stop ksmd and unmerge all pages currently merged, but leave mergeable areas registered for next run
  Default: 0 (must be changed to 1 to activate KSM, except if CONFIG_SYSFS is disabled)
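To see where a host currently stands on these knobs, something like the following sketch could be used. ksm_status and KSM_DIR are made-up names. pages_shared and pages_sharing are not in the excerpt above, but they are described in the same ksm.txt; files that do not exist are simply skipped.

```shell
#!/bin/sh
# KSM_DIR is a hypothetical override so the function can be tried on a scratch directory.
KSM_DIR="${KSM_DIR:-/sys/kernel/mm/ksm}"

# ksm_status: print each readable KSM sysfs file as "name = value".
ksm_status() {
    for f in run pages_to_scan sleep_millisecs pages_shared pages_sharing; do
        if [ -r "$KSM_DIR/$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$KSM_DIR/$f")"
        fi
    done
}
```

Running "ksm_status" before and after a change makes it easy to confirm that the write actually took effect.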