This section addresses CPU considerations in the guest operating system.
Side-Channel Vulnerability Mitigation in Guest Operating Systems
A class of security vulnerabilities collectively known as “side-channel vulnerabilities” has been discovered in many modern CPUs. These include the vulnerabilities commonly called Spectre, Meltdown, and Foreshadow (also known as L1TF), among others. Mitigation for some side-channel vulnerabilities takes place in the guest operating system. These mitigations address serious security vulnerabilities, but they can also have a significant impact on performance, especially on systems that are CPU resource constrained.
Because of the complexity of this topic, as well as its ever-changing nature, it isn’t thoroughly addressed in this book. For the latest information about these vulnerabilities as they relate to VMware products, see the operating system-specific mitigations sections in VMware KB articles 52245, 54951, and 55636, all of which are updated regularly.
Virtual NUMA (vNUMA)
Virtual NUMA (vNUMA) exposes NUMA topology to the guest operating system, allowing NUMA-aware guest operating systems and applications to make the most efficient use of the underlying hardware’s NUMA architecture.
For more information about NUMA, see Non-Uniform Memory Access (NUMA).
Virtual NUMA, which requires virtual hardware version 8 or later, can in some cases provide significant performance benefits for wide virtual machines (as defined in Non-Uniform Memory Access (NUMA)), though the benefits depend heavily on the level of NUMA optimization in the guest operating system and applications.
You can obtain the maximum performance benefits from vNUMA if your clusters are composed entirely of hosts with matching NUMA architecture. The hosts in a single VMware Cloud on AWS cluster always have matching NUMA architecture, but when VMs are moved to a different VMware Cloud on AWS cluster, or to an on-premises ESXi host, the NUMA architecture could be different.
Matching NUMA architecture matters because the very first time a vNUMA-activated virtual machine is powered on, its vNUMA topology is set based in part on the NUMA topology of the underlying physical host on which it is running. By default, once a virtual machine’s vNUMA topology is initialized it doesn’t change unless the number of vCPUs in that virtual machine is changed. This means that if a vNUMA virtual machine is moved to a host with a different NUMA topology, the virtual machine’s vNUMA topology might no longer be optimal for the underlying physical NUMA topology, potentially resulting in reduced performance.
When sizing your virtual machines, take into account the size of the physical NUMA nodes:
For the best performance, try to size your virtual machines to stay within a physical NUMA node. For example, if you have a host system with six cores per NUMA node, try to size your virtual machines with no more than six vCPUs.
When a virtual machine needs to be larger than a single physical NUMA node, try to size it such that it can be split evenly across as few physical NUMA nodes as possible; the sketch after this list illustrates this sizing arithmetic.
Use caution when creating a virtual machine with a vCPU count that exceeds the physical processor core count on a host. Because hyper-threads are counted as logical processors, such a configuration is sometimes permitted, but it can create CPU contention when those vCPUs are all in use.
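As a rough illustration of the sizing guidance above, the following is a minimal arithmetic sketch (not a VMware tool; the function name and the example core counts are purely illustrative) that shows whether a given vCPU count fits within one physical NUMA node and whether it divides evenly across the minimum number of nodes it must span:

    # Illustrative sketch only: check how a proposed vCPU count maps onto
    # physical NUMA nodes of a given size (cores per node).
    def suggest_vnuma_fit(vcpus: int, cores_per_node: int) -> str:
        if vcpus <= cores_per_node:
            return f"{vcpus} vCPUs fit within a single {cores_per_node}-core NUMA node."
        nodes = -(-vcpus // cores_per_node)  # ceiling division: minimum nodes spanned
        if vcpus % nodes == 0:
            return (f"{vcpus} vCPUs split evenly across {nodes} NUMA nodes "
                    f"({vcpus // nodes} vCPUs per node).")
        return (f"{vcpus} vCPUs cannot be split evenly across {nodes} NUMA nodes; "
                f"consider a vCPU count divisible by {nodes}.")

    # Example: a host with six cores per NUMA node.
    print(suggest_vnuma_fit(6, 6))   # fits in one node
    print(suggest_vnuma_fit(8, 6))   # splits evenly as 4 + 4 across two nodes
    print(suggest_vnuma_fit(9, 6))   # spans two nodes, but 9 does not divide evenly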
Changing the corespersocket value doesn’t influence vNUMA or the vNUMA topology. The vSocket and corespersocket configuration now affects only how the virtual processors are presented to the guest OS, which can be relevant for software licensing. vNUMA automatically determines the proper vNUMA topology to present to the guest OS based on the underlying host.
This decoupling of the corespersocket setting from vNUMA allows vSphere to automatically determine the best vNUMA topology.
To disable this behavior and directly control the vNUMA topology, see the numa.vcpu.followcorespersocket setting in the Virtual NUMA Controls section of vSphere Resource Management.
Note: Although corespersocket no longer directly sets the vNUMA topology, some corespersocket values could result in sub-optimal guest OS topologies; that is, topologies that are not efficiently mapped to the physical NUMA nodes, potentially resulting in reduced performance.
For more information about this, see VMware KB article 81383 and the Virtual Machine vCPU and vNUMA Rightsizing – Guidelines blog post. In addition, the Virtual Machine Compute Optimizer tool can provide guidance on configuring optimal topologies.
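If you manage virtual machine configuration programmatically, the following is a minimal sketch, assuming the open-source pyVmomi SDK and a vim.VirtualMachine object named vm that has already been retrieved from vCenter (the vCPU and core counts shown are examples only). It changes only how sockets and cores are presented to the guest OS; the vNUMA topology is still determined automatically, as described above:

    # Sketch using pyVmomi; assumes `vm` is an existing, powered-off
    # vim.VirtualMachine object retrieved from vCenter elsewhere.
    from pyVmomi import vim

    spec = vim.vm.ConfigSpec()
    spec.numCPUs = 12            # total vCPUs presented to the guest
    spec.numCoresPerSocket = 6   # guest sees 2 sockets x 6 cores (presentation/licensing
                                 # only; vNUMA topology is still chosen automatically)
    task = vm.ReconfigVM_Task(spec=spec)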
By default, vNUMA is activated only for virtual machines with more than eight vCPUs. This feature can be activated for smaller virtual machines, however, while still allowing VMware Cloud on AWS to automatically manage the vNUMA topology. This can be useful for wide virtual machines (that is, virtual machines with more vCPUs than the number of cores in each physical NUMA node) with eight or fewer vCPUs.
To activate vNUMA for virtual machines with eight or fewer vCPUs, use the vSphere Client to set numa.vcpu.min to the minimum virtual machine size (in vCPUs) for which you want vNUMA activated, as follows (a scripted equivalent is sketched after these steps):
Right-click the virtual machine you wish to change, then select Edit Settings....
Under the VM Options tab, expand Advanced, then click EDIT CONFIGURATION....
Look for numa.vcpu.min and configure it as you wish. If the variable isn’t present, click ADD CONFIGURATION PARAMS and enter it as a new parameter.
Click OK to close the Configuration Parameters window.
Click OK to close the Edit Settings window.
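If you prefer to apply the same parameter from a script, the following is a minimal sketch using the pyVmomi SDK, assuming a vim.VirtualMachine object named vm that has already been retrieved from vCenter and that the virtual machine is powered off (the value 4 is an example only):

    # Sketch using pyVmomi; adds numa.vcpu.min as an advanced configuration
    # parameter so that vNUMA is activated for VMs with 4 or more vCPUs.
    from pyVmomi import vim

    spec = vim.vm.ConfigSpec()
    spec.extraConfig = [vim.option.OptionValue(key="numa.vcpu.min", value="4")]
    task = vm.ReconfigVM_Task(spec=spec)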
Alternatively, you can take full manual control of a virtual machine’s vNUMA topology using the maxPerVirtualNode option. For more details, see the Virtual NUMA Controls section of vSphere Resource Management.
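Continuing the pyVmomi sketch above, this manual control is carried as the numa.vcpu.maxPerVirtualNode advanced parameter; the value shown here is purely illustrative, so consult vSphere Resource Management before using it:

    # Illustrative only: cap each vNUMA node for this VM at 4 vCPUs.
    spec = vim.vm.ConfigSpec()
    spec.extraConfig = [vim.option.OptionValue(key="numa.vcpu.maxPerVirtualNode", value="4")]
    task = vm.ReconfigVM_Task(spec=spec)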
CPU Hot Add is a feature that allows the addition of vCPUs to a running virtual machine. Activating this feature, however, deactivates vNUMA for that virtual machine, resulting in the guest OS seeing a single vNUMA node. Without vNUMA support, the guest OS has no knowledge of the CPU and memory virtual topology of the host. This in turn could result in the guest OS making sub-optimal scheduling decisions, leading to reduced performance for applications running in large virtual machines.
For this reason, activate CPU Hot Add only if you expect to use it. Alternatively, plan to power down the virtual machine before adding vCPUs, or configure the virtual machine with the maximum number of vCPUs that might be needed by the workload. If choosing the latter option, note that unused vCPUs incur a small amount of unnecessary overhead. Unused vCPUs could also cause the guest OS to make poor scheduling decisions within the virtual machine, again with the potential for reduced performance.
For additional information see VMware KB article 2040375.
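To identify virtual machines that currently have CPU Hot Add activated (and therefore have vNUMA deactivated), the following is a minimal sketch, again assuming pyVmomi and a list of vim.VirtualMachine objects named vms that has already been retrieved from vCenter:

    # Sketch using pyVmomi; reports VMs whose CPU Hot Add setting
    # deactivates vNUMA for that VM.
    from pyVmomi import vim

    for vm in vms:
        if vm.config.cpuHotAddEnabled:
            print(f"{vm.name}: CPU Hot Add is activated, so vNUMA is deactivated")
            # To deactivate it, power the VM off and reconfigure, for example:
            # vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(cpuHotAddEnabled=False))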