Memory compression on hypervisor vs VM
I've been wondering whether it's better for memory pages to be compressed at the hypervisor level, or on the VM level.
I'm leaning toward the VM level, because
1: VMs have better knowledge of memory pressure by the application, and can better decide when to swap pages out to zram. The VM has access to information about memory pages that the hypervisor doesn't have.
2: if pages are compressed on the hypervisor level, the VM doesn't "see" any increased memory available. The host box gains free memory, but the application never sees it to make use of it. It'll just see the same 8GB as it always has, so it never really benefits. This maybe lets you host more VMs on one box, but at the cost of the applications not being as efficient.
Is this a reasonable position? I'm wondering if I'm missing something obvious.
It depends on the application.
Do you have some apps that are inactive for long periods of time and then “wake up”? Better to do it at the highest level. That gives the host OS the power to essentially shunt whole VMs away and give the active ones full power.
Are they all pretty active all the time? Are memory performance requirements not too high? Is latency a priority? Best to do it inside the VM, I suppose.
EDIT: For what it’s worth, I found that no zram is best in some scenarios. Sometimes applications just barely, rarely scrape the memory limit, and if I enable a big chunk of zram they scrape it more frequently, then don’t give it up and keep active pages in zram. Rare swapping to an ssd ended up much, much faster.
Unless you are running at really large scales, or at really small scales and trying to fit stuff that doesn't quite fit, memory compression may not be a significant enough optimization to spend a lot of time experimenting with. But I'm bored and currently on an 8 GB device, so here are my thoughts dumped out from my recent testing:
Zram vs Zswap (can be done at the hypervisor or in the guest):
Kernel same-page merging (KSM) (would be done at the hypervisor level) (ESXi also has an equivalent feature under a different name, Transparent Page Sharing):
In my opinion, the best thing is to enable zram or zswap at the virtual machine level and kernel same-page merging at the hypervisor level, assuming you take into account and accept the marginal security risk and slightly weaker isolation that comes with KSM. There isn't any point running zswap at two layers, because the hypervisor is just gonna spend a lot of time trying to compress stuff that's already been compressed. Then KSM deduplicates memory across guests. Although you may actually see worse savings overall if zram/zswap compression is only semi-deterministic and makes deduplication harder.
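To illustrate the "no point compressing twice" part, here's a minimal sketch. It uses Python's `zlib` as a stand-in for whatever compressor zram/zswap is configured with (an assumption, the kernel typically uses lzo, lz4 or zstd), and the page contents are made up:

```python
import zlib

PAGE_SIZE = 4096  # typical x86 page size


def compress_twice(page: bytes) -> tuple[int, int]:
    """Return (size after one pass, size after a second pass on that output)."""
    once = zlib.compress(page)   # what the guest would do
    twice = zlib.compress(once)  # what a second layer would be attempting
    return len(once), len(twice)


# A made-up, highly repetitive page, e.g. a buffer full of similar strings.
page = (b"GET /index.html HTTP/1.1\r\nHost: example\r\n" * 200)[:PAGE_SIZE]

first, second = compress_twice(page)
print(f"raw page:    {len(page)} bytes")
print(f"first pass:  {first} bytes")
print(f"second pass: {second} bytes")
```

The first pass shrinks the page a lot. The second pass has almost nothing left to work with, so the outer layer burns CPU for roughly zero gain.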
I agree with the other commenter as well about zram being weird with some workloads. For example, I've heard of Blender (I think it was) interacting weirdly with zram: zram is a swap device that lives in RAM, so it leaves less total memory available for uncompressed pages, whereas zswap is a compressed cache sitting in front of an existing swap device. If you really need to know, you gotta test.
Hypervisor. The hypervisor doesn’t need to know much about application needs. It can perform compression/deduplication for the VM, and you can therefore add more memory to the VM and prevent it from using additional swap and CPU to perform memory management.
The other benefit of assigning more memory instead of using guest compression is that the hypervisor can use a memory balloon when it needs to reclaim memory, forcing the VM to decide what will stay in memory and what will be sent to swap.
The concepts of storage are similar. If you need to encrypt data at rest, it’s usually better to let the hypervisor or SAN handle it. Letting a VM perform storage encryption would work, but would eat up CPU cycles and prevent the hypervisor from performing compression, deduplication, and in some cases knowing what space is used but empty.
Storage compression is similar: you want the hypervisor to handle it, since it can compress and deduplicate blocks of data that are the same across the environment. If you have 1000 machines all running the same OS with many of the same applications installed, then you’ll have a lot of opportunity to save space. You can use these same ideas when it comes to memory.
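A toy sketch of that "same pages across many machines" idea, applied to memory. The guest names and page contents are invented, and hashing is a simplification used here for brevity (real KSM compares page contents directly rather than trusting a hash):

```python
import hashlib

PAGE_SIZE = 4096


def dedup_counts(guests: dict[str, list[bytes]]) -> tuple[int, int]:
    """Return (total pages, distinct pages) across all guests."""
    total = 0
    distinct = set()
    for pages in guests.values():
        for page in pages:
            total += 1
            distinct.add(hashlib.sha256(page).digest())
    return total, len(distinct)


# Eight pages every guest shares (think: the same OS image loaded everywhere).
os_pages = [bytes([i]) * PAGE_SIZE for i in range(8)]

# Three hypothetical guests, each with the shared pages plus two private ones.
guests = {}
for name in ("vm-a", "vm-b", "vm-c"):
    private = [f"{name}-{j}".encode().ljust(PAGE_SIZE, b"\0") for j in range(2)]
    guests[name] = os_pages + private

total, distinct = dedup_counts(guests)
print(f"{total} pages held by guests, {distinct} need to be stored")
```

Only the hypervisor can see that the shared pages are identical across guests, which is why this kind of saving isn't available from inside a VM.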