Memory compression on hypervisor vs host

I've been wondering whether it's better for memory pages to be compressed at the hypervisor level, or on the VM level.

I'm leaning toward the VM level, because

1: VMs have better knowledge of memory pressure by the application, and can better decide when to swap pages out to zram. The VM has access to information about memory pages that the hypervisor doesn't have.

2: If pages are compressed at the hypervisor level, the VM doesn't "see" any additional memory. The host box gains free memory, but the application never gets to make use of it: it still sees the same 8GB it always has, so it never really benefits. This may let you host more VMs on one box, but at the cost of the applications being less efficient.

Is this a reasonable position? I'm wondering if I'm missing something obvious.

Comments (3)

brucethemoose

lemmy.world

It depends on the application.

Do you have some apps that are inactive for long periods of time and then “wake up”? Better to do it at the highest level. That gives the host OS the power to essentially shunt whole VMs away and give the active ones full power.

Are they all pretty active all the time? Are memory performance requirements not too high? Is latency a priority? Best to do it inside, I suppose.

EDIT: For what it’s worth, I found that no zram is best in some scenarios. Sometimes applications just barely, rarely scrape the memory limit, and if I enable a big chunk of zram they scrape it more frequently, then don’t give it up and keep active pages in zram. Rare swapping to an SSD ended up much, much faster.

moonpiedumplings

programming.dev

Unless you are running at really large scales, or at really small scales and trying to fit stuff that doesn't quite fit, memory compression may not be a significant enough optimization to spend a lot of time experimenting with. But I'm bored and currently on an 8 GB device, so here are my thoughts dumped out from my recent testing:

Zram vs Zswap (can be done at the hypervisor or in the VM):

One or the other is commonly enabled on many modern distros. It is a perfectly reasonable position to simply use the distro's defaults and not push it any further.
Zram has much, much better compression, but suffers from LRU inversion. Essentially, after zram is full, fresh pages (memory) go to the disk swap instead. Since these pages will probably be needed again, it is slower to get them from the disk than it would be to get them from zram.
Zswap has much, much worse compression, but cold, unused pages are moved to disk swap automatically, freeing up space.
I am investigating ways to get around the above. See my thoughts on this and other differences here: https://github.com/moonpiedumplings/moonpiedumplings.github.io/blob/main/playground/asahi-setup/index.md#memory-optimization
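
If you want to check what a given layer is actually doing before tuning anything, here is a minimal Python sketch. It assumes a Linux system, a zram device named zram0, and that the first two fields of /sys/block/zram0/mm_stat are the uncompressed and compressed data sizes in bytes. The helper names (zram_ratio, read_text_or_none) are mine, not part of any library.

```python
from pathlib import Path


def zram_ratio(mm_stat_text):
    """Return uncompressed/compressed size from one zram mm_stat line.

    Assumes field 0 is orig_data_size and field 1 is compr_data_size (bytes).
    Returns 0.0 when nothing has been stored yet.
    """
    fields = mm_stat_text.split()
    orig_bytes, compressed_bytes = int(fields[0]), int(fields[1])
    return orig_bytes / compressed_bytes if compressed_bytes else 0.0


def read_text_or_none(path):
    """Read a sysfs file, or return None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # "Y" or "N" when the zswap module is present; None otherwise.
    print("zswap enabled:", read_text_or_none("/sys/module/zswap/parameters/enabled"))

    # zram0 is an assumed device name; adjust if your distro uses another.
    mm_stat = read_text_or_none("/sys/block/zram0/mm_stat")
    if mm_stat:
        print("zram0 compression ratio: %.2f" % zram_ratio(mm_stat))
    else:
        print("zram0 not found")
```

Run it once inside the VM and once on the hypervisor to see which layer is really doing the compressing.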

Kernel same page merging (KSM) (would be done at the hypervisor level; ESXi also has an equivalent feature under a different name):

Only really efficient if you have lots of the same virtual machines
Used to overcommit (promise more RAM than you physically have)
- Dangerous, but highly cost saving. Many cheap VPS providers do this in order to save money. You can run four 8 GB VPSes on 24 GB of RAM and take a semi-safe bet that not all of the memory will be used.
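
To put numbers on the overcommit example above, here is a small Python sketch of the arithmetic. It ignores the hypervisor's own memory overhead and any savings from KSM, so treat it as a rough bound rather than a sizing tool. The function names are made up for illustration.

```python
def overcommit_ratio(vm_sizes_gb, host_ram_gb):
    """Memory promised to VMs divided by memory physically present."""
    return sum(vm_sizes_gb) / host_ram_gb


def max_average_usage(vm_sizes_gb, host_ram_gb):
    """Largest fraction of its allocation each VM can use, on average,
    before the host runs out of physical RAM (capped at 1.0)."""
    return min(1.0, host_ram_gb / sum(vm_sizes_gb))


vms = [8, 8, 8, 8]   # four 8 GB VPSes
host = 24            # 24 GB of physical RAM

print("promised: %d GB on %d GB" % (sum(vms), host))
print("overcommit ratio: %.2f" % overcommit_ratio(vms, host))
print("safe average usage per VM: %.0f%%" % (100 * max_average_usage(vms, host)))
```

So the bet is that the four VMs never average more than three quarters of their allocation at the same time.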

In my opinion, the best thing is to enable zram or zswap at the virtual machine level and kernel same page merging at the hypervisor level, assuming you take into account and accept the marginal security risk and slightly weaker isolation that come with KSM. There isn't any point running zswap at two layers, because the hypervisor is just going to spend a lot of time trying to compress stuff that's already been compressed. Then KSM deduplicates memory across VMs. That said, you may actually see worse savings overall if zram/zswap compression is only semi-deterministic and makes deduplication harder.
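
The "no point compressing at two layers" claim is easy to see for yourself. Below is a minimal Python sketch that compresses a fake 4 KiB page and then compresses the result again. It uses zlib purely as a stand-in (the kernel uses other compressors such as lzo or zstd), and the page contents are synthetic, so the exact numbers will not match real workloads.

```python
import random
import zlib

PAGE_SIZE = 4096
# Small made-up vocabulary so the page is compressible, like text-heavy memory.
WORDS = [b"alloc", b"free", b"page", b"swap", b"cache", b"dirty", b"clean",
         b"inode", b"mmap", b"heap", b"stack", b"kernel", b"guest", b"host",
         b"zram", b"zswap"]


def make_page(seed=0):
    """Build one synthetic 4 KiB page of space-separated words."""
    rng = random.Random(seed)
    buf = b" ".join(rng.choice(WORDS) for _ in range(1024))
    return buf[:PAGE_SIZE]


def compress_twice(page):
    """Return (raw size, size after one pass, size after a second pass)."""
    once = zlib.compress(page)    # what the guest would store
    twice = zlib.compress(once)   # what a second layer would try to squeeze
    return len(page), len(once), len(twice)


raw, once, twice = compress_twice(make_page())
print("raw page:      ", raw, "bytes")
print("compressed x1: ", once, "bytes")
print("compressed x2: ", twice, "bytes")
```

The second pass burns CPU and gets essentially nothing back, which is what the hypervisor would be doing to pages the guest has already compressed.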

I agree with the other commenter as well about zram being weird with some workloads. For example, I've heard of an application (Blender, I think) interacting weirdly with zram, since zram is a swap device and so leaves less total memory available in RAM, whereas zswap compresses memory. If you really need to know, you've got to test.

Brkdncr

lemmy.world

Hypervisor. The hypervisor doesn’t need to know much about application needs. It can perform compression/deduplication for the VM, and you can therefore add more memory to the VM and prevent it from using additional swap and CPU to perform memory management.

The other benefit of assigning more memory instead of using guest compression is that the hypervisor can use a memory balloon when it needs to reclaim memory, forcing the VM to decide what will stay in memory and what will be sent to swap.

The concepts of storage are similar. If you need to encrypt data at rest, it’s usually better to let the hypervisor or SAN handle it. Letting a VM perform storage encryption would work, but would eat up CPU cycles and prevent the hypervisor from performing compression, deduplication, and in some cases knowing what space is used but empty.

Storage compression is similar: you want the hypervisor to handle it, since it can compress blocks of data that are the same across the environment. If you have 1000 machines all running the same OS with many of the same applications installed, then you’ll have a lot of opportunity to save space. You can use these same ideas when it comes to memory.
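
As a rough illustration of why identical machines deduplicate so well, here is a toy Python sketch that hashes fixed-size blocks across several fake disk images and counts how many are unique. The 4 KiB block size, the image layout, and the dedup_stats name are assumptions for the example; real hypervisors and SANs use their own block sizes and data structures.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for the example


def dedup_stats(images):
    """Return (total blocks, unique blocks) across all images."""
    seen = set()
    total = 0
    for image in images:
        for offset in range(0, len(image), BLOCK_SIZE):
            block = image[offset:offset + BLOCK_SIZE]
            seen.add(hashlib.sha256(block).digest())
            total += 1
    return total, len(seen)


# Ten toy "machines": three shared OS blocks plus one block unique to each.
base_os = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3))
images = [base_os + bytes([100 + vm]) * BLOCK_SIZE for vm in range(10)]

total, unique = dedup_stats(images)
print("blocks stored without dedup:", total)
print("blocks stored with dedup:   ", unique)
```

If the guests encrypt or compress their own data first, the shared OS blocks stop looking identical from the outside and the unique count climbs back toward the total.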