Memory Overcommit Techniques

This section describes the techniques VMware Cloud on AWS uses to allow memory overcommitment.

VMware Cloud on AWS uses four memory management mechanisms to dynamically reduce the amount of machine physical memory required for each virtual machine. These are page sharing, ballooning, memory compression, and swapping.

Page Sharing: VMware Cloud on AWS can use a proprietary technique to transparently share memory pages between virtual machines, thus eliminating redundant copies of memory pages. While pages are shared by default within virtual machines, pages are not shared by default between virtual machines for security reasons.
Ballooning: If the host memory begins to get low and the virtual machine’s memory usage approaches its memory target, VMware Cloud on AWS will use ballooning to reduce that virtual machine’s memory demands. Using a VMware-supplied vmmemctl module installed in the guest operating system as part of the VMware Tools suite, VMware Cloud on AWS can cause the guest operating system to relinquish the memory pages it considers least valuable. Ballooning provides performance closely matching that of a native system under similar memory constraints. To use ballooning, the guest operating system must be configured with sufficient swap space.
Memory Compression: If the virtual machine’s memory usage approaches the level at which host-level swapping will be required, VMware Cloud on AWS will use memory compression to reduce the number of memory pages it will need to swap out. Because the decompression latency is much smaller than the swap-in latency, compressing memory pages has significantly less impact on performance than swapping out those pages.
Host-Level Swapping: VMware Cloud on AWS will next forcibly reclaim memory from the virtual machine by swapping out pages to a swap file. Some of the swapped pages might be active, and the high latency of accessing swapped-out pages can cause virtual machine performance to degrade significantly. (Note that this swapping is distinct from the swapping that can occur within the virtual machine under the control of the guest operating system.)
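
To illustrate the idea behind page sharing described above, the following toy sketch counts how many pages could be stored once when several pages hold identical contents. It is only an illustration under stated assumptions: the function name, the 4 KB page size, and the use of SHA-256 are hypothetical choices for this example, and it does not represent the proprietary technique that VMware Cloud on AWS actually uses.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


def shared_page_savings(pages):
    """Return (total_pages, unique_pages, pages_saved) for a list of page contents.

    Hypothetical helper: pages with identical contents are counted once.
    A real implementation would also compare full page contents after a hash
    match and would handle writes to a shared page (copy-on-write).
    """
    unique = {hashlib.sha256(page).digest() for page in pages}
    total = len(pages)
    return total, len(unique), total - len(unique)


# Six example pages: three zero-filled, two with identical contents, one distinct.
zero_page = bytes(PAGE_SIZE)
code_page = b"\x90" * PAGE_SIZE
data_page = b"\x01" * PAGE_SIZE
pages = [zero_page, zero_page, code_page, zero_page, code_page, data_page]

total, unique, saved = shared_page_savings(pages)
print(f"{total} pages, {unique} unique, {saved} redundant copies eliminated")
```

In this example, only three of the six pages need to be backed by physical memory; the other three are redundant copies.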

For further information about memory management, see Understanding Memory Resource Management in VMware vSphere 5.0 (though this paper specifically addresses vSphere 5.0, most of the concepts are still applicable).

While VMware Cloud on AWS uses page sharing, ballooning, and memory compression to allow significant memory overcommitment, usually with little or no impact on performance, you should avoid overcommitting memory to the point that host-level swapping is used to swap out active memory pages.

If you suspect that memory overcommitment is beginning to affect the performance of a virtual machine, you can take the following steps:

Note:

The point at which memory overcommitment begins to affect a workload’s performance depends heavily on the workload. The areas we describe in this section will be relevant for most workloads. For some of those workloads, however, performance will be noticeably impacted earlier than for others.

In the vSphere Client, select the virtual machine in question, select Monitor > Performance > Advanced, select Memory from the drop-down menu at the upper right, then look at the value of Ballooned memory (Average).
An absence of ballooning suggests that the host is not under heavy memory pressure and thus memory overcommitment is not affecting the performance of that virtual machine.

Note:
This indicator is only meaningful if the balloon driver is installed in the virtual machine and is not prevented from working.

Some ballooning is quite normal and not indicative of a problem.
In the vSphere Client, select the virtual machine in question, select Monitor > Performance > Advanced, select Memory from the drop-down menu at the upper right, then compare the values of Consumed memory and Active memory. If Consumed memory is higher than Active memory, this suggests that the guest is currently getting all the memory it requires for best performance.
In the vSphere Client, select the virtual machine in question, select Monitor > Utilization, expand the Guest Memory pane, then look at the values of Swapped and Compressed. Swapping and compressing at the host level indicate more significant memory pressure.
Check for guest operating system swap activity within that virtual machine.
This can indicate that ballooning might be starting to impact performance, though swap activity can also be related to other issues entirely within the guest operating system (or can be an indication that the guest memory size is simply too small).
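
The checks above can be summarized as a small decision helper. The sketch below is a hypothetical aid, not a VMware tool or API: the class, field, and function names are invented for this example, and the values are assumed to be read manually from the vSphere Client charts (in any consistent unit) and from the guest operating system.

```python
from dataclasses import dataclass


@dataclass
class VmMemoryStats:
    """Hypothetical container for values read manually from the vSphere Client."""
    ballooned: float          # Ballooned memory (Average)
    consumed: float           # Consumed memory
    active: float             # Active memory
    swapped: float            # Swapped, from the Guest Memory pane
    compressed: float         # Compressed, from the Guest Memory pane
    guest_swap_active: bool   # swap activity observed inside the guest OS
    balloon_driver_working: bool = True  # driver installed and not disabled


def triage(stats):
    """Return a list of observations, following the order of the steps above."""
    findings = []
    if not stats.balloon_driver_working:
        findings.append("Balloon driver unavailable: the ballooning value is not meaningful.")
    elif stats.ballooned == 0:
        findings.append("No ballooning: the host is probably not under heavy memory pressure.")
    else:
        findings.append("Some ballooning: often normal; check the remaining indicators.")
    if stats.consumed > stats.active:
        findings.append("Consumed > active: the guest is likely getting the memory it needs.")
    if stats.swapped > 0 or stats.compressed > 0:
        findings.append("Host-level swapping or compression: more significant memory pressure.")
    if stats.guest_swap_active:
        findings.append("Guest swap activity: ballooning may be affecting performance, "
                        "or the guest memory size may be too small.")
    return findings


# Example: a virtual machine showing ballooning plus host-level swapping.
example = VmMemoryStats(ballooned=512, consumed=2048, active=1900,
                        swapped=64, compressed=0, guest_swap_active=True)
for line in triage(example):
    print(line)
```

The helper deliberately uses no numeric thresholds, because the point at which overcommitment affects performance depends heavily on the workload.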