Nova provisions virtual compute instances on top of the SDDC infrastructure. Nova comprises a set of daemons running as Kubernetes Pods on top of the Tanzu Kubernetes cluster to provide the compute provisioning service.
Nova compute consists of the following daemon processes:
nova-api: Accepts and responds to end-user compute API requests such as VM boot, reset, resize, and so on.
nova-compute: Creates and terminates VM instances.
nova-scheduler: Takes a VM instance request from the queue and determines which nova-compute runs it.
nova-conductor: Handles requests that need coordination and acts as a database proxy. Nova conductor communicates between Nova processes.
Nova Compute
Unlike a traditional KVM-based approach where each hypervisor is represented as a nova compute, the VMware vCenter driver activates the nova-compute service to communicate with a VMware vCenter Server instance.
The vCenter driver aggregates all ESXi hosts within each cluster and presents one large hypervisor to the nova scheduler. VIO deploys a nova-compute Pod for each vSphere ESXi cluster that it manages. Because individual ESXi hosts are not exposed to the nova scheduler, the scheduler assigns hypervisor compute hosts at the granularity of vSphere clusters. vCenter then selects the ESXi host within the cluster based on the advanced DRS placement settings. Both automated and partially automated DRS are supported for standard VM workloads. DRS must be deactivated for SR-IOV workloads.
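The aggregation described above can be pictured with a minimal sketch. It is a simplified model, not the vCenter driver's code: the class names, host names, and capacity figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EsxiHost:
    name: str
    vcpus: int
    memory_mb: int


@dataclass
class ClusterHypervisor:
    """What the scheduler sees: one hypervisor per vSphere cluster."""
    cluster: str
    vcpus: int
    memory_mb: int
    largest_host_memory_mb: int


def aggregate_cluster(cluster: str, hosts: List[EsxiHost]) -> ClusterHypervisor:
    # Sum the per-host capacity; the individual hosts are not reported.
    return ClusterHypervisor(
        cluster=cluster,
        vcpus=sum(h.vcpus for h in hosts),
        memory_mb=sum(h.memory_mb for h in hosts),
        largest_host_memory_mb=max(h.memory_mb for h in hosts),
    )


hosts = [
    EsxiHost("esxi-01", vcpus=64, memory_mb=524288),
    EsxiHost("esxi-02", vcpus=64, memory_mb=524288),
    EsxiHost("esxi-03", vcpus=32, memory_mb=262144),
]
view = aggregate_cluster("compute-cluster-a", hosts)
print(view.vcpus, view.memory_mb)  # 160 1310720
```

In this model the aggregated total is larger than what any single ESXi host offers, which is one reason the final host choice is left to DRS inside the cluster.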
Nova Host Aggregates
A nova host aggregate is a grouping of hypervisors or nova computes. Groupings can be done based on the host hardware similarity. For example, clusters with SSD storage backing can be grouped into one aggregate and clusters with magnetic storage into another aggregate. If hardware attributes are similar, grouping can also be based on the physical location of the cluster in the form of availability zones. If there are N data centers, all ESXi clusters within a data center can be grouped into a single aggregate. An ESXi cluster can be in more than one host aggregate.
Host aggregates provide a mechanism for administrators to assign key-value pairs, also called metadata, to compute groups. The nova scheduler can use these key-value pairs to select the hardware that matches the client request. Host aggregates are visible only to administrators. Users consume aggregates indirectly, through the VM flavor definition and the availability zone.
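As an illustration of the grouping, the following sketch models aggregates as plain dictionaries and looks up the clusters that satisfy a set of metadata requirements. The aggregate names, cluster names, and metadata keys are hypothetical, and the lookup is a simplification rather than Nova's implementation.

```python
from typing import Dict, List, Set

# Hypothetical aggregates: cluster-a belongs to two of them.
aggregates: Dict[str, dict] = {
    "ssd": {"hosts": {"cluster-a", "cluster-b"}, "metadata": {"storage": "ssd"}},
    "magnetic": {"hosts": {"cluster-c"}, "metadata": {"storage": "magnetic"}},
    "dc1": {"hosts": {"cluster-a", "cluster-c"}, "metadata": {"datacenter": "dc1"}},
}


def metadata_for(cluster: str) -> Dict[str, Set[str]]:
    """Collect the metadata of every aggregate the cluster belongs to."""
    merged: Dict[str, Set[str]] = {}
    for aggregate in aggregates.values():
        if cluster in aggregate["hosts"]:
            for key, value in aggregate["metadata"].items():
                merged.setdefault(key, set()).add(value)
    return merged


def clusters_matching(required: Dict[str, str]) -> List[str]:
    """Return the clusters whose aggregate metadata satisfies every pair."""
    all_clusters = set().union(*(a["hosts"] for a in aggregates.values()))
    return sorted(
        c for c in all_clusters
        if all(v in metadata_for(c).get(k, set()) for k, v in required.items())
    )


print(clusters_matching({"storage": "ssd", "datacenter": "dc1"}))  # ['cluster-a']
```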
Nova Scheduler
VMware Integrated OpenStack uses the nova-scheduler service to determine where to place a new workload or a modification to an existing workload request, for example, during a live migration or when a new VM starts up. A nova-scheduler is simply a filter. Based on the type of request, it eliminates nova-computes that cannot achieve the workload request and returns those that can.
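The elimination step can be sketched as a chain of predicates applied to every candidate. This is a conceptual model only: the class names, the three filters, and the capacity numbers are hypothetical, and the weighting of surviving hosts that Nova performs afterwards is omitted.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ComputeNode:
    name: str
    free_vcpus: int
    free_ram_mb: int
    free_disk_gb: int


@dataclass
class Request:
    vcpus: int
    ram_mb: int
    disk_gb: int


HostFilter = Callable[[ComputeNode, Request], bool]


def cpu_filter(node: ComputeNode, req: Request) -> bool:
    return node.free_vcpus >= req.vcpus


def ram_filter(node: ComputeNode, req: Request) -> bool:
    return node.free_ram_mb >= req.ram_mb


def disk_filter(node: ComputeNode, req: Request) -> bool:
    return node.free_disk_gb >= req.disk_gb


def filter_hosts(nodes: List[ComputeNode], req: Request,
                 filters: List[HostFilter]) -> List[ComputeNode]:
    # A node survives only if every filter accepts it.
    return [n for n in nodes if all(f(n, req) for f in filters)]


nodes = [
    ComputeNode("cluster-a", free_vcpus=8, free_ram_mb=16384, free_disk_gb=500),
    ComputeNode("cluster-b", free_vcpus=64, free_ram_mb=262144, free_disk_gb=4000),
    ComputeNode("cluster-c", free_vcpus=64, free_ram_mb=8192, free_disk_gb=4000),
]
request = Request(vcpus=16, ram_mb=32768, disk_gb=100)
survivors = filter_hosts(nodes, request, [cpu_filter, ram_filter, disk_filter])
print([n.name for n in survivors])  # ['cluster-b']
```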
Nova scheduler also controls host CPU, memory, and disk over-subscription. Over-subscription maps multiple virtual devices to the same physical resource to optimize usage. Over-subscription can be defined per host aggregate. When activated, the following filters control aggregate-level over-subscription:
AggregateCoreFilter: Filters hosts by CPU core numbers with a per-aggregate cpu_allocation_ratio value.
AggregateDiskFilter: Filters hosts by disk allocation with a per-aggregate disk_allocation_ratio value.
AggregateRamFilter: Filters hosts by RAM allocation of instances with a per-aggregate ram_allocation_ratio value.
If the per-aggregate value is not found, the value falls back to the global setting. If the host is in more than one aggregate and thus more than one value is found, the minimum value is used.
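The fallback and minimum rules can be written as a short function. This is a sketch of the rule as stated here, not the filter source code; the global ratio of 16.0 and the core count are illustrative values only.

```python
from typing import Dict, List


def effective_ratio(host_aggregates: List[Dict[str, str]],
                    key: str, global_ratio: float) -> float:
    """Pick the allocation ratio that applies to one host.

    host_aggregates holds the metadata of every aggregate the host is in.
    """
    values = [float(md[key]) for md in host_aggregates if key in md]
    # No per-aggregate value: fall back to the global setting.
    # Several values: the minimum wins.
    return min(values) if values else global_ratio


def vcpu_limit(physical_cores: int, ratio: float) -> int:
    """Number of vCPUs that can be scheduled at a given ratio."""
    return int(physical_cores * ratio)


GLOBAL_CPU_RATIO = 16.0  # illustrative global setting

in_two_aggregates = [
    {"cpu_allocation_ratio": "4.0"},
    {"cpu_allocation_ratio": "2.0", "storage": "ssd"},
]
ratio = effective_ratio(in_two_aggregates, "cpu_allocation_ratio", GLOBAL_CPU_RATIO)
print(ratio, vcpu_limit(128, ratio))  # 2.0 256
print(effective_ratio([], "cpu_allocation_ratio", GLOBAL_CPU_RATIO))  # 16.0
```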
Instead of over-subscription, cloud administrators can dedicate compute hosts to OpenStack tenants. You can use the AggregateMultiTenancyIsolation filter to control VM placement based on the OpenStack tenant. In this context, a tenant is an OpenStack project. If an aggregate has the filter_tenant_id metadata key, the hosts in the aggregate accept instances only from that tenant or list of tenants. No other tenant is allowed on these hosts.
As stated in the Nova Compute section, individual ESXi hosts are not exposed to the nova scheduler. Therefore, ensure that the nova scheduler filters align with the underlying vSphere resource allocation.
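The tenant isolation rule can be sketched as follows. Two details are assumptions rather than statements from this section: a list of tenants is taken to be a comma-separated string, and a host whose aggregates carry no filter_tenant_id key is taken to accept any tenant. The project IDs are hypothetical.

```python
from typing import Dict, List, Set


def allowed_tenants(host_aggregates: List[Dict[str, str]]) -> Set[str]:
    """Collect the tenants named by filter_tenant_id across a host's aggregates."""
    allowed: Set[str] = set()
    for metadata in host_aggregates:
        raw = metadata.get("filter_tenant_id", "")
        # Assumption: several tenants are given as a comma-separated list.
        allowed.update(t.strip() for t in raw.split(",") if t.strip())
    return allowed


def host_passes(host_aggregates: List[Dict[str, str]], tenant_id: str) -> bool:
    allowed = allowed_tenants(host_aggregates)
    if not allowed:
        # Assumption: hosts without the key are open to every tenant.
        return True
    return tenant_id in allowed


reserved = [{"filter_tenant_id": "project-red, project-blue"}]
shared = [{"storage": "ssd"}]

print(host_passes(reserved, "project-red"))    # True
print(host_passes(reserved, "project-green"))  # False
print(host_passes(shared, "project-green"))    # True
```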
Nova Compute Scaling
As workloads increase, the cluster must be scaled to meet new capacity demands. While vCenter Server can have a maximum cluster size of 64 ESXi hosts, VIO cluster scaling varies depending on the use case. You can add new capacity to a VIO deployment in two ways:
Vertical scaling: Increase the number of hosts in a cluster.
Horizontal scaling: Deploy a new vCenter Server cluster and add the cluster as a new nova-compute Pod to an existing or new nova compute aggregate.
The choice between vertical and horizontal scaling must be based on the use case and the number of concurrent operations against the OpenStack API. The following table outlines the most frequently used deployment scenarios:
Use Case | Expected Parallel OpenStack Operations | Scaling Model |
---|---|---|
Traditional Enterprise | Low | Horizontal or Vertical |
Direct API access to the infrastructure (example: CI/CD workflow) | High | Horizontal |
Direct API access to the infrastructure (example: Terraform automation workflow) | Low | Horizontal or Vertical |
NFV deployment | Low | Horizontal or Vertical |
Cloud Native workload running on top of Kubernetes | Low | Horizontal or Vertical |
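The scaling guidance can be condensed into a small helper. This is only a sketch of the table and of the 64-host cluster maximum cited above; the function name and its arguments are hypothetical, and a real decision would weigh more factors.

```python
from typing import List

MAX_HOSTS_PER_CLUSTER = 64  # vCenter Server cluster maximum cited above


def scaling_options(current_hosts: int, hosts_to_add: int,
                    parallel_operations: str) -> List[str]:
    """Return the scaling models that remain open for one cluster.

    parallel_operations is "low" or "high", as in the table.
    """
    if parallel_operations not in ("low", "high"):
        raise ValueError("parallel_operations must be 'low' or 'high'")
    if parallel_operations == "high":
        # High API concurrency: the table lists horizontal scaling only.
        return ["horizontal"]
    if current_hosts + hosts_to_add > MAX_HOSTS_PER_CLUSTER:
        # The cluster cannot grow past the vCenter maximum.
        return ["horizontal"]
    return ["horizontal", "vertical"]


print(scaling_options(16, 8, "low"))   # ['horizontal', 'vertical']
print(scaling_options(60, 8, "low"))   # ['horizontal']
print(scaling_options(16, 8, "high"))  # ['horizontal']
```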
Nova Flavors and Compute Performance Tuning
To consume a nova host aggregate, cloud admins must create and expose VM offerings so that users can request VMs that match their application vCPU, memory, and disk requirements. In OpenStack, a flavor represents a type of VM offering.
An OpenStack flavor defines the compute, memory, and storage capacity of the computing instances.
In addition to capacity, a flavor can also indicate the hardware profile, such as SSD, spinning disks, CPU types, CPU family, and so on. Hardware profiles are often implemented through flavor extra-specs.
Nova flavor extra-specs are key-value pairs that define which compute or host aggregate a flavor can run on. Based on extra-spec, the nova-scheduler locates the hardware that matches the corresponding key-value pairs on the compute node.
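The matching of flavor extra-specs against aggregate metadata can be sketched as below. The sketch assumes the aggregate_instance_extra_specs: key scope that upstream Nova uses for its AggregateInstanceExtraSpecsFilter; this section does not name that filter, so treat the scope and the flavor contents as illustrative.

```python
from typing import Dict

# Assumption: placement keys carry this scope prefix in the flavor.
SCOPE = "aggregate_instance_extra_specs:"


def aggregate_matches(flavor_extra_specs: Dict[str, str],
                      aggregate_metadata: Dict[str, str]) -> bool:
    """True if every scoped extra-spec equals the aggregate's metadata value."""
    for key, wanted in flavor_extra_specs.items():
        if not key.startswith(SCOPE):
            continue  # keys such as hw:... are VM settings, not placement rules
        if aggregate_metadata.get(key[len(SCOPE):]) != wanted:
            return False
    return True


# Hypothetical flavor that asks for SSD-backed clusters.
ssd_flavor = {
    "aggregate_instance_extra_specs:storage": "ssd",
    "hw:cpu_policy": "dedicated",
}

print(aggregate_matches(ssd_flavor, {"storage": "ssd"}))       # True
print(aggregate_matches(ssd_flavor, {"storage": "magnetic"}))  # False
print(aggregate_matches(ssd_flavor, {}))                       # False
```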
Data-plane intensive workloads require VM-level parameters for maximum performance. VM-level parameters can also be handled using nova flavors. The vCloud NFV Performance Tuning Guide outlines the recommended VM-level parameters to be set when deploying data-plane intensive workloads:
Virtual CPU Pinning
NUMA alignment
CPU/Memory reservation setting
Selective vCPU Pinning
Tenant VDC
Huge Page
Passthrough Networking
VM Parameter | Flavor Metadata Category | Metadata Values |
---|---|---|
CPU Pinning | CPU Pinning policy | hw:cpu_policy=dedicated |
CPU Pinning | VMware Policies | vmware:latency_sensitivity_level=high |
CPU Pinning | VMware Quota | quota:cpu_reservation_percent and quota:memory_reservation_percent=100 |
Selective vCPU Pinning | Custom | vmware:latency_sensitivity_per_cpu_high="<cpu-id1>,<cpu-id2>" |
Selective vCPU Pinning | CPU Pinning policy | hw:cpu_policy=dedicated |
Selective vCPU Pinning | VMware Quota | quota:cpu_reservation_percent and quota:memory_reservation_percent=100 |
Selective vCPU Pinning | VMware Policies | vmware:latency_sensitivity_level=high |
NUMA | VMware Policies | numa.nodeAffinity="numa id" |
Huge Pages | Guest Memory Backing | hw:mem_page_size="size" |
Huge Pages | VMware Quota | quota:memory_reservation_percent=100 |
Tenant vDC | VMware Policies | vmware:tenant_vdc=UUID |
CPU Memory Reservation | VMware Quota | For more details, see Supported Flavor Extra Spec. |
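As a worked example, the extra-specs for a data-plane intensive flavor can be assembled from the keys in the table above. The helper names, the flavor name, and the 1GB page size are hypothetical; the rendered command uses the standard openstack flavor set --property syntax and is printed, not executed.

```python
import shlex
from typing import Dict, List, Optional


def data_plane_extra_specs(mem_page_size: Optional[str] = None,
                           tenant_vdc: Optional[str] = None) -> Dict[str, str]:
    """Build extra-specs for CPU pinning with full CPU and memory reservation."""
    specs = {
        "hw:cpu_policy": "dedicated",
        "vmware:latency_sensitivity_level": "high",
        "quota:cpu_reservation_percent": "100",
        "quota:memory_reservation_percent": "100",
    }
    if mem_page_size is not None:
        specs["hw:mem_page_size"] = mem_page_size  # huge pages
    if tenant_vdc is not None:
        specs["vmware:tenant_vdc"] = tenant_vdc    # Tenant VDC UUID
    return specs


def property_args(specs: Dict[str, str]) -> List[str]:
    """Render the extra-specs as --property arguments."""
    args: List[str] = []
    for key, value in specs.items():
        args += ["--property", f"{key}={value}"]
    return args


specs = data_plane_extra_specs(mem_page_size="1GB")
command = ["openstack", "flavor", "set", "dp.large"] + property_args(specs)
print(shlex.join(command))
```

Keeping the extra-specs in one dictionary makes it easier to review which VM-level parameters a flavor carries before it is exposed to users.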
Design Recommendation | Design Justification | Design Implication |
---|---|---|
Create nova host aggregates to group vSphere clusters sharing similar characteristics. | Host aggregates provide a mechanism to allow administrators to group compute clusters. | None |
Assign key-value pairs to host aggregates based on the hardware profile and data center affinity. | The nova scheduler uses the key-value pairs to select the hardware that matches the client request. | None |
Use AggregateFilter to control nova compute (vSphere cluster) over-subscription. | Over-subscription leads to greater resource utilization efficiency. | You must ensure that the over-subscription ratio does not conflict with the Tenant VDC reservation. Over-subscription can lead to degraded SLA. |
Use AggregateMultiTenancyIsolation filter to place VMs based on the tenant ID. | Used when an entire cluster must be reserved for a specific OpenStack Tenant. | Unbalanced resource consumption |
When using a tenant VDC, create a new default VDC with no reservation and map all default flavors to the new VDC. | From a vSphere resource hierarchy perspective, VMs created using default flavors belong to the same resource hierarchy as Tenant VDC. If Tenant VDCs and VMs share the parent resource pool, there is no guarantee that VMs with no resource reservation will not use CPU shares from a Tenant VDC with full reservation. Mapping default VIO flavors to a child VDC without resource reservation alleviates this issue. | None |
Use OpenStack metadata extra specs to set VM-level parameters for data plane workloads in the compute flavor. | Metadata extra specs translate to VM settings on vSphere. | None |
When adding more ESXi resources, consider building new vSphere clusters instead of adding to the existing cluster in an environment with a large churn. | New vSphere clusters introduce more parallelism when supporting a large number of concurrent API requests. | New clusters introduce new objects to manage in the OpenStack database. |